Normalized citations across providers
Each LLM cites sources differently: Perplexity uses inline footnote markers, ChatGPT (with browsing) returns a separate sources array, Claude with web search emits XML tags, and Gemini attaches grounding metadata. We normalize all of them into a single `citations` array per provider, with `url`, `domain`, `title`, and `snippet`.
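For reference, the per-provider shape looks roughly like this. Only `url`, `domain`, `title`, and `snippet` are the documented citation fields; the type names and the surrounding `ProviderResult` wrapper are our sketch, not the API's schema:

```typescript
// Illustrative sketch of the normalized shape described above.
interface Citation {
  url: string;      // canonical URL after redirect resolution
  domain: string;   // e.g. "example.com"
  title: string;
  snippet: string;
}

interface ProviderResult {
  provider: "perplexity" | "chatgpt" | "claude" | "gemini";
  text: string;           // answer text with inline markers rewritten out
  citations: Citation[];
}
```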
We also resolve redirects and strip tracking parameters, so the URLs you get are canonical and dedupe-friendly. If three providers all cite the same article via three different short links, you see one entry per provider and a clean source URL.
Deduplication happens at two levels. Per-provider, a citation that the model lists twice (common in Perplexity's longer answers) collapses into one entry with a `count` field. Cross-provider, the top-level `citations[]` array merges identical canonical URLs and exposes a `providers_cited[]` so you can answer 'which engines all cited this article?' without joining four arrays yourself.
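Put together, a top-level merged entry might look like the following. The per-provider `count` lives on the provider-level entries; the exact field layout here is illustrative:

```typescript
// Illustrative top-level merged entry: one canonical URL, with the
// engines that cited it collected into providers_cited[].
const merged = {
  url: "https://example.com/2025-cdn-report",  // canonical, tracker-stripped
  domain: "example.com",
  title: "2025 CDN Report",
  snippet: "…",
  providers_cited: ["perplexity", "chatgpt", "claude"],
};
```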
Brand mention tagging
Pass an array of brand names in `track_brands`. Each provider's response is scanned for those mentions, with case-insensitive matching, common alias handling (e.g., 'OpenAI' / 'Open AI'), and position metadata so you can render highlights in your UI without re-parsing.
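A minimal request sketch, assuming bearer-key auth against the `/v1/check` endpoint described later in this post; the base URL is a placeholder and the mention shape in the trailing comment is illustrative:

```typescript
// track_brands and /v1/check are documented; the base URL and auth
// scheme here are placeholders, not the real endpoint.
const res = await fetch("https://api.example.com/v1/check", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.MENTIONS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    prompt: "What are the best CDN providers in 2025?",
    track_brands: ["Cloudflare", "Fastly", "Akamai"],
  }),
});
const data = await res.json();
// Each mention carries position metadata for highlighting, e.g.:
// { brand: "Cloudflare", start: 112, end: 122, sentiment: "neutral" }
```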
Sentiment scoring (`positive`, `neutral`, `negative`) runs on the surrounding sentence, not the full answer, so brand mentions in different sections get accurate per-mention sentiment.
Fuzzy matching uses a Levenshtein-≤2 distance threshold over the normalized brand string, which catches the realistic typos and casing variations LLMs introduce ('Cloudfare' for 'Cloudflare', 'Postgres' for 'PostgreSQL') without hallucinating matches on unrelated tokens. The match scope is reported in the response so your UI can show the user exactly which substring matched, not just a yes/no boolean.
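As a sketch of that rule (our illustration of the stated threshold, not the service's implementation): normalize both strings, then accept candidates within edit distance 2.

```typescript
// Classic dynamic-programming Levenshtein distance.
function levenshtein(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                   // deletion
        dp[i][j - 1] + 1,                                   // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Normalization here is an assumption: lowercase, whitespace removed.
const normalize = (s: string) => s.toLowerCase().replace(/\s+/g, "");

function isBrandMatch(token: string, brand: string): boolean {
  return levenshtein(normalize(token), normalize(brand)) <= 2;
}

isBrandMatch("Cloudfare", "Cloudflare");  // true: one insertion away
isBrandMatch("Cloudinary", "Cloudflare"); // false: distance 3, unrelated token
```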
One bill, one rate limit
Tracking your spend across four providers is a part-time accounting job. With MentionsAPI, every call deducts from one wallet balance. One line on your Stripe receipts, one rate limit, and you stop being surprised by Anthropic's quarterly token re-pricing.
The wallet model is also why the aggregator is cheaper than DIY for most repeat workloads. Direct provider billing is per-token and varies by model tier; MentionsAPI bills $0.02 on cache hits and $0.25 on full multi-provider fan-outs, regardless of underlying token usage. For agency tools running the same prompt list against 50-200 client brands, cache hit rates of 70-80% on second-and-later runs are typical. That's the gap between a $40 monthly bill and a $400 one.
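The wallet arithmetic is simple enough to sanity-check yourself. Here it is as a two-line model; the per-call prices are the ones quoted above, while the workload size is illustrative:

```typescript
// Wallet pricing from above: $0.02 per cache hit, $0.25 per full fan-out.
const costPerCall = (hitRate: number) => hitRate * 0.02 + (1 - hitRate) * 0.25;

// Illustrative workload: 2,000 tracked prompt/brand checks per month.
const calls = 2000;
console.log((calls * costPerCall(0)).toFixed(2));    // first run, no cache: "500.00"
console.log((calls * costPerCall(0.75)).toFixed(2)); // steady state, 75% hits: "155.00"
```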
How aggregation and deduplication work under the hood
An incoming `/v1/check` request hits a normalization pipeline with three stages: dispatch, parse, and merge. Dispatch fans out to the chosen providers in parallel via per-provider adapters that handle authentication, request shape, and upstream rate-limit signals. Parse runs four format-specific extractors that pull each provider's citations into the common `{url, domain, title, snippet}` shape: Perplexity's inline `[1]` markers get rewritten out of `text`, ChatGPT's tool-call sources get unnested, Claude's `<citation>` XML gets parsed, and Gemini's grounding metadata gets unrolled.
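A sketch of the dispatch and parse stages under those assumptions; the adapter interface and names are ours, not the service's internals:

```typescript
type Citation = { url: string; domain: string; title: string; snippet: string };

interface ProviderAdapter {
  name: string;
  call(prompt: string): Promise<unknown>;   // handles auth + request shape
  parse(raw: unknown): Citation[];          // format-specific extractor
}

async function dispatch(prompt: string, adapters: ProviderAdapter[]) {
  // Promise.allSettled keeps one slow or failing provider from
  // sinking the whole fan-out.
  const settled = await Promise.allSettled(
    adapters.map(async (a) => ({
      provider: a.name,
      citations: a.parse(await a.call(prompt)),
    }))
  );
  // Keep fulfilled providers; upstream failures are surfaced elsewhere.
  return settled.flatMap((s) => (s.status === "fulfilled" ? [s.value] : []));
}
```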
Merge runs two passes: a URL canonicalizer that resolves common shorteners (`t.co`, `news.google.com`, AMP redirects) and strips tracker parameters (`utm_*`, `fbclid`, `gclid`, `mc_eid`), then a dedupe pass that groups by canonical URL. The output is the per-provider `results[i].citations[]` plus a top-level `citations[]` with `providers_cited[]`, so you can render either shape without re-joining.
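The tracker-stripping half of that canonicalizer fits in a few lines with the standard URL API; the parameter list mirrors the ones named above, and redirect resolution (`t.co`, AMP, `news.google.com`) is elided from this sketch:

```typescript
const TRACKERS = [/^utm_/, /^fbclid$/, /^gclid$/, /^mc_eid$/];

function stripTrackers(raw: string): string {
  const u = new URL(raw);
  // Copy keys first: deleting while iterating searchParams is unsafe.
  for (const key of [...u.searchParams.keys()]) {
    if (TRACKERS.some((re) => re.test(key))) u.searchParams.delete(key);
  }
  return u.toString();
}

// Dedupe pass: group by canonical URL, collecting which providers cited it.
function dedupe(entries: { provider: string; url: string }[]) {
  const byUrl = new Map<string, Set<string>>();
  for (const { provider, url } of entries) {
    const canon = stripTrackers(url);
    if (!byUrl.has(canon)) byUrl.set(canon, new Set());
    byUrl.get(canon)!.add(provider);
  }
  return [...byUrl].map(([url, providers]) => ({ url, providers_cited: [...providers] }));
}
```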
Latency profile: the merge step adds roughly 30-80 ms over the raw fan-out latency, dominated by the redirect resolver (which we cache aggressively). For cached requests this overhead is moot. The entire normalized response is stored verbatim and returns in ~140 ms. For uncached `mode: quick` requests, total median latency lands at 3-4 s; the slowest provider is the long pole, not our parsing.
Edge cases worth knowing: when two providers cite different canonical URLs that redirect to the same final destination, we merge them on the resolved canonical, not the source URL. When one provider cites a URL that 4xxs at resolution time, we keep the original URL with a `resolved: false` flag rather than dropping it. Partial information beats silent loss.
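The keep-on-failure rule is easy to picture in code. This function is our illustration; the `resolved: false` flag and the merge-on-final-destination behavior are the documented parts:

```typescript
// Follow redirects to the final destination; on a 4xx/5xx or network
// error, keep the original URL and flag it rather than dropping it.
async function resolveUrl(url: string): Promise<{ url: string; resolved: boolean }> {
  try {
    const res = await fetch(url, { method: "HEAD", redirect: "follow" });
    if (res.ok) return { url: res.url, resolved: true }; // res.url is the final URL
    return { url, resolved: false };                     // 4xx at resolution time
  } catch {
    return { url, resolved: false };                     // network error: same policy
  }
}
```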
When to use an aggregator (and when to build your own)
Use the aggregator when you need cross-provider normalized data and you don't want to maintain four citation parsers. The clearest fit is GEO tools, multi-LLM brand monitoring, citation-attribution dashboards, and any internal tool where the same prompt needs to run across providers and the results need to land in one schema. Time-to-value is hours, not engineer-months.
Use it specifically over LiteLLM or OpenRouter when you need extraction layered on top of normalization. Those tools are routing/proxy layers: they'll happily hand you raw provider responses, but you'll still be writing the citation parsers and brand extractors yourself. If the hard part of your project is the parsing, the aggregator is the better fit; if it's only routing one call to one model, OpenRouter is leaner.
Don't use it if you want a self-hosted, in-VPC aggregator with full control over the upstream key store. LiteLLM is open source and can sit inside your network; we run it as a managed service. Enterprise customers with strict data-residency requirements should email [email protected]. We have a deployment story but it's not the headline product.