For developers building AI-visibility tools

The LLM API aggregator built for brand and citation tracking

An LLM API aggregator is a single endpoint that proxies multiple model providers, normalizes their responses into a common shape, and bills against one wallet. MentionsAPI aggregates ChatGPT, Claude, Gemini, and Perplexity behind `/v1/check`, and unlike pure-routing aggregators, it adds citation deduplication, brand extraction, and a 24-hour shared cache as part of the normalization layer.

An LLM aggregator that just returns four blocks of text isn't doing the hard part. The hard part is normalization: pulling citations out of Perplexity's footnote format, parsing Gemini's JSON-when-it-feels-like-it responses, and turning all of it into a comparable structure.

MentionsAPI is an aggregator that does the normalization. Citations are extracted, deduplicated, and resolved to canonical URLs. Brand mentions are tagged with position and sentiment. Each result tells you exactly which model version answered.

It's an aggregator built by people who've lost a Sunday to Perplexity's citation parser, so you don't have to.

Pay as you go · Top up from $10 · Pay per call · Credits never expire · No plans

Normalized citations across providers

Each LLM cites sources differently. Perplexity uses inline footnote markers, ChatGPT (with browsing) uses a separate sources array, and Claude with web search uses XML tags. We normalize all of them into a single `citations` array per provider, with `url`, `domain`, `title`, and `snippet`.

We also resolve redirects and strip tracking parameters, so the URLs you get are canonical and dedupe-friendly. If three providers all cite the same article via three different short links, you see one entry per provider and a clean source URL.

Deduplication happens at two levels. Per-provider, a citation that the model lists twice (common in Perplexity's longer answers) collapses into one entry with a `count` field. Cross-provider, the top-level `citations[]` array merges identical canonical URLs and exposes a `providers_cited[]` so you can answer 'which engines all cited this article?' without joining four arrays yourself.
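
For reference, here's a sketch of the two views that produces. Field names are the ones documented on this page (`url`, `domain`, `title`, `snippet`, `count`, `canonical_url`, `providers_cited[]`); the URLs, titles, and snippet text are placeholders.

{
  "citations": [
    {
      "canonical_url": "https://example.com/observability-roundup",
      "title": "Observability Tools Roundup",
      "providers_cited": ["openai", "perplexity"]
    }
  ],
  "results": [
    {
      "provider": "perplexity",
      "citations": [
        {
          "url": "https://example.com/observability-roundup",
          "domain": "example.com",
          "title": "Observability Tools Roundup",
          "snippet": "Datadog and New Relic remain the most-cited platforms...",
          "count": 2
        }
      ]
    }
  ]
}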

Brand mention tagging

Pass an array of brand names in `track_brands`. Each provider's response is scanned for those mentions, with case-insensitive matching, common alias handling (e.g., 'OpenAI' / 'Open AI'), and position metadata so you can render highlights in your UI without re-parsing.

Sentiment scoring (`positive`, `neutral`, `negative`) runs on the surrounding sentence, not the full answer, so brand mentions in different sections get accurate per-mention sentiment.

Fuzzy matching uses a Levenshtein-≤2 distance threshold over the normalized brand string, which catches the realistic typos and casing variations LLMs introduce ('Cloudfare' for 'Cloudflare', 'Postgres' for 'PostgreSQL') without hallucinating matches on unrelated tokens. The match scope is reported in the response so your UI can show the user exactly which substring matched, not just a yes/no boolean.
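
Concretely, a tagged mention might come back shaped like this. Treat the field names as illustrative; what the sections above commit to is position metadata, per-mention sentiment, and a reported match scope.

{
  "brand": "Cloudflare",
  "provider": "openai",
  "matched_text": "Cloudfare",
  "position": { "start": 412, "end": 421 },
  "sentiment": "positive"
}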

One bill, one rate limit

Tracking your spend across four providers is a part-time accounting job. With MentionsAPI, every call deducts from one wallet balance: one line on your Stripe receipts, one rate limit, and no surprises the next time a provider re-prices its tokens.

The wallet model is also why the aggregator is cheaper than DIY for most repeat workloads. Direct provider billing is per-token and varies by model tier; MentionsAPI bills $0.02 on cache hits and $0.25 on full multi-provider fan-outs, regardless of underlying token usage. For agency tools running the same prompt list against 50-200 client brands, cache hit rates of 70-80% on second-and-later runs are typical. That's the gap between a $40 monthly bill and a $400 one.
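
If you want to sanity-check that against your own volumes, the wallet math fits in a few lines. The poll cadence and hit rate below are illustrative assumptions, not measured numbers:

// cost-estimate.mjs: back-of-envelope, not a quote
const callsPerMonth = 24 * 30; // one portfolio poll per hour, for a month
const hitRate = 0.8;           // typical from the second run onward
const costPerCall = hitRate * 0.02 + (1 - hitRate) * 0.25;
console.log(`$${(callsPerMonth * costPerCall).toFixed(2)}/month`); // $47.52/month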

How aggregation and deduplication work under the hood

An incoming `/v1/check` request hits a normalization pipeline with three stages: dispatch, parse, and merge. Dispatch fans out to the chosen providers in parallel via per-provider adapters that handle authentication, request shape, and upstream rate limit signals. Parse runs four format-specific extractors that pull each provider's citations into the common `{url, domain, title, snippet}` shape; Perplexity's inline `[1]` markers get rewritten out of `text`, ChatGPT's tool-call sources get unnested, Claude's `<citation>` XML gets parsed, and Gemini's grounding metadata gets unrolled.
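
A minimal sketch of the dispatch stage, assuming hypothetical adapter functions (these are stand-ins, not MentionsAPI internals):

// dispatch.mjs: fan out in parallel, keep partial results on failure
const adapters = {
  openai: async (prompt) => ({ provider: "openai", raw: `stub answer to: ${prompt}` }),
  perplexity: async (prompt) => ({ provider: "perplexity", raw: `stub answer to: ${prompt}` }),
};

async function dispatch(providers, prompt) {
  const settled = await Promise.allSettled(
    providers.map((p) => adapters[p](prompt))
  );
  // A rejected slot becomes an errors[] entry instead of failing the whole call
  return settled.map((s, i) =>
    s.status === "fulfilled"
      ? s.value
      : { provider: providers[i], error: String(s.reason) }
  );
}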

Merge runs two passes: a URL canonicalizer that resolves common shorteners (`t.co`, `news.google.com`, AMP redirects) and strips tracker parameters (`utm_*`, `fbclid`, `gclid`, `mc_eid`), then a dedupe pass that groups by canonical URL. The output is the per-provider `results[i].citations[]` plus a top-level `citations[]` with `providers_cited[]`, so you can render either shape without re-joining.
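
In code, the tracker-stripping and grouping passes look roughly like this sketch (redirect and shortener resolution omitted; the parameter list is the one above):

// merge.mjs: strip tracker params, then group by canonical URL
function canonicalize(rawUrl) {
  const url = new URL(rawUrl);
  for (const key of [...url.searchParams.keys()]) {
    if (key.startsWith("utm_") || ["fbclid", "gclid", "mc_eid"].includes(key)) {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}

function mergeCitations(results) {
  const byUrl = new Map();
  for (const { provider, citations = [] } of results) {
    for (const c of citations) {
      const key = canonicalize(c.url);
      const entry =
        byUrl.get(key) ?? { canonical_url: key, title: c.title, providers_cited: [] };
      if (!entry.providers_cited.includes(provider)) entry.providers_cited.push(provider);
      byUrl.set(key, entry);
    }
  }
  return [...byUrl.values()];
}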

Latency profile: the merge step adds roughly 30-80 ms over the raw fan-out latency, dominated by the redirect resolver (which we cache aggressively). For cached requests this overhead is moot. The entire normalized response is stored verbatim and returns in ~140 ms. For uncached `mode: quick` requests, total median latency lands at 3-4 s; the slowest provider is the long pole, not our parsing.

Edge cases worth knowing: when two providers cite different canonical URLs that redirect to the same final destination, we merge them on the resolved canonical, not the source URL. When one provider cites a URL that 4xxs at resolution time, we keep the original URL with a `resolved: false` flag rather than dropping it. Partial information beats silent loss.

When to use an aggregator (and when to build your own)

Use the aggregator when you need cross-provider normalized data and you don't want to maintain four citation parsers. The clearest fit is GEO tools, multi-LLM brand monitoring, citation-attribution dashboards, and any internal tool where the same prompt needs to run across providers and the results need to land in one schema. Time-to-value is hours, not engineer-months.

Use it specifically over LiteLLM or OpenRouter when you need extraction layered into normalization. Those tools are routing/proxy layers and will happily give you raw provider responses, but you'll still write the citation parsers and brand extractors yourself. If your project's hard part is parsing, this is the aggregator. If it's only routing one call to one model, OpenRouter is leaner.

Don't use it if you want a self-hosted, in-VPC aggregator with full control over the upstream key store. LiteLLM is open source and can sit inside your network; MentionsAPI runs only as a managed service. Enterprise customers with strict data-residency requirements should email [email protected]. We have a deployment story, but it's not the headline product.

FAQ

Frequently asked questions

Answer-first, dev-to-dev. Each one is also embedded as FAQPage schema for AI engines.

What is an LLM API aggregator?
An LLM API aggregator is a single endpoint that proxies multiple language-model providers, normalizes their responses into a common schema, and bills against one wallet. MentionsAPI is an aggregator built specifically for cross-provider analytics. Citations get deduplicated, brands get extracted, and every provider returns the same fields, so you can compare answers without writing four parsers.
How does MentionsAPI deduplicate citations across providers?
Every cited URL is run through a server-side canonicalizer that resolves common shorteners (`t.co`, `news.google.com`, AMP redirects) and strips tracker params (`utm_*`, `fbclid`, `gclid`). The deduped output ships as a top-level `citations[]` with a `providers_cited[]` per URL, so you can answer 'which providers cited this URL' without joining four arrays.
How is MentionsAPI different from OpenRouter?
OpenRouter routes one request to one provider at a time and bills per-token; MentionsAPI fans out to multiple providers in parallel and adds normalization, brand extraction, and a 24-hour shared cache on top. If you want to pick the cheapest provider for a single call, OpenRouter is the right tool. If you want side-by-side answers across providers with structured outputs, MentionsAPI is the right tool.
Can I self-host MentionsAPI like LiteLLM?
Not currently. We ship as a managed service on Cloudflare Workers, with the upstream provider keys held server-side. Self-host or in-VPC deployment is on the Enterprise roadmap; if you have data-residency requirements, email [email protected] to discuss a private deployment. LiteLLM is the right open-source aggregator if self-hosting is non-negotiable today.
How much does an aggregated multi-provider call cost?
$0.25 for the multi-provider fan-out (2-4 LLMs in parallel), $0.75 for the full fan-out with `web_search: true`, $0.02 for cache hits. One billable line per call, regardless of provider count. A typical 50-brand portfolio (all brands in one `track_brands` array) polled hourly via `mode: quick` runs under $50/month thanks to ~70-80% cache hit rates on repeat agency prompts.
Can I bring my own provider keys?
Not at the moment. MentionsAPI bundles the OpenAI, Anthropic, Google, and Perplexity keys behind your bearer token. BYOK is on the V2 Enterprise roadmap. If you need to use your own provider contracts (negotiated pricing, dedicated capacity, regional routing), email [email protected] and we'll prioritize the hand-off.
How do I tell which providers cited which URLs?
Every top-level `citations[]` entry has a `providers_cited[]` array listing the providers that referenced that canonical URL. You can also iterate `data.results[]` and read each provider's per-result `citations[]` for the un-merged view. Two views, one canonical URL store, no duplicate parsing on your side.
What happens when a provider rolls out a breaking change?
Our adapter layer absorbs it. We treat upstream provider changes as our problem, not yours: when OpenAI added `tool_calls` to chat completions, our parser caught it within hours and your code didn't need to change. If a provider's response shape changes mid-call (rare but real), the slot returns an `errors[]` entry and your normalized schema stays intact for the providers that did parse cleanly.
Code example

Aggregate citations across providers

Drop in your API key and you're live. Same response shape across every provider.

check.mjs
const res = await fetch("https://api.mentionsapi.com/v1/check", {
  method: "POST",
  headers: {
    Authorization: "Bearer lvk_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    providers: ["openai", "anthropic", "gemini", "perplexity"],
    prompt: "What are the leading observability platforms?",
    track_brands: ["Datadog", "New Relic", "Honeycomb"],
  }),
});
const data = await res.json();
// data.citations[] is normalized across providers. Each entry has
// canonical_url, domains[], providers_cited[], title
Compare

MentionsAPI vs. existing aggregators

|  | The other way | MentionsAPI |
| --- | --- | --- |
| OpenRouter | Routing layer, no extraction | Routing + brand/citation extraction |
| LiteLLM (self-hosted) | You run + maintain it | Hosted, cached, 24/7 |
| Custom proxy | You write normalization | Normalization built-in |
| Direct provider APIs | 4 invoices, 4 rate limits | 1 invoice, 1 rate limit |
Pricing

Top up from $10. Pay per call. No plans.

Aggregator pricing scales with volume: $0.02 per cache hit, $0.25 per uncached `/v1/check?mode=quick` fan-out (2-4 LLMs in parallel) or `perplexity_live` UI scrape, and $0.75 with `web_search: true`. A portfolio of 50 brands polled hourly via `mode: quick` (one `track_brands` call per poll) costs under $50/month with the shared cache. $1 free signup credit, $10 minimum top-up, no monthly tiers.

Stop wiring up four SDKs.

One API key, four answer engines, structured responses. $10 minimum top-up. Credits never expire.