For developers building AI-visibility tools

AI citation tracking API for ChatGPT, Claude, Gemini, and Perplexity

An AI citation tracking API is a programmatic interface for monitoring which URLs language-model answer engines cite as sources for which prompts. MentionsAPI extracts citations from ChatGPT, Claude, Gemini, and Perplexity, resolves redirects to canonical URLs, deduplicates across providers, and tags each citation with the prompt that surfaced it, so you can build the inbound graph that now decides AI-search visibility.

When an AI answer engine cites your URL, that's the closest thing to a backlink the new search era provides. The catch: citations are buried in different formats per provider, often behind redirect links, and rarely deduplicated.

MentionsAPI extracts every citation from every provider, resolves redirects to canonical URLs, deduplicates across providers, and tags each one with the prompt that surfaced it. You get a clean inbound graph: which prompts drive citations to which of your pages, by which provider.

Wire it into your analytics stack and you can finally answer: 'which of our pages are winning in AI answers?'

Top up from $10 · Pay per call · Credits never expire · No plans

From citation to actionable URL

Perplexity gives you `[1]`-style footnotes that map to a citations array. ChatGPT browsing returns sources nested in a tool-call response. Gemini sometimes embeds raw URLs in markdown; Claude (with web search enabled) returns XML-tagged citations. None of these are interchangeable. We normalize all four into a single array of `{url, domain, title, snippet}` objects.
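
In TypeScript terms, the normalized shape is roughly the following. A sketch of the fields named above, not the formal response schema:

```ts
// Sketch of the normalized citation object described above.
// Illustrative only; the live response may carry additional fields.
interface Citation {
  url: string;     // canonical URL after redirect resolution
  domain: string;  // host the citation points at
  title: string;   // page title as reported by the provider
  snippet: string; // excerpt the engine quoted, when available
}
```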

Redirects from `t.co`, `news.google.com`, and tracker-laden URLs get resolved server-side. The URL you see is the URL the buyer would land on, ready to feed your analytics.

We also strip the standard tracker parameters (`utm_*`, `fbclid`, `gclid`, `mc_eid`, `_hsenc`) before exposing canonical URLs, so the same article cited via a Twitter share, a newsletter blast, and a direct link all collapse into one row in your reporting. The cleanup happens in our redirect resolver and the original `source_url` is preserved in the response if you need it for audit.
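
If you need the same cleanup for URLs arriving from elsewhere in your pipeline, the parameter-stripping half is easy to replicate. A minimal TypeScript sketch, not our production resolver:

```ts
// Strip the tracker parameters listed above; everything else is kept.
const TRACKERS = ["fbclid", "gclid", "mc_eid", "_hsenc"];

function stripTrackers(raw: string): string {
  const url = new URL(raw);
  // Copy keys first: deleting while iterating searchParams skips entries.
  for (const key of [...url.searchParams.keys()]) {
    if (TRACKERS.includes(key) || key.startsWith("utm_")) {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}

// stripTrackers("https://example.com/post?utm_source=tw&fbclid=abc&id=7")
// => "https://example.com/post?id=7"
```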

Track citations to your domains

Every response includes a top-level `citations[]` array with canonicalized URLs and the `providers_cited` that referenced each one. Filter by your domain on the client side. No regex, no parsing of per-provider formats.
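
Client-side, that filter is a few lines. A sketch assuming each entry carries `domain` and `providers_cited[]` as described above:

```ts
interface CitedEntry {
  url: string;
  domain: string;
  providers_cited: string[];
}

// Keep only citations pointing at domains you own.
function ownCitations(citations: CitedEntry[], domains: string[]): CitedEntry[] {
  const mine = new Set(domains);
  return citations.filter((c) => mine.has(c.domain));
}

// ownCitations(response.citations, ["example.com", "docs.example.com"])
```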

Cross-reference with your prompt library to build a 'pages cited per prompt' table. That's the table that justifies your content investments to the rest of the company.

Citation patterns also expose the provider mix. ChatGPT and Claude lean heavily on a few high-authority domains per category; Perplexity surfaces a wider tail; Gemini's grounding behavior swings the most from call to call. The same canonical-URL filter answers different strategic questions: 'are we cited at all?' vs. 'are we one of the few?' vs. 'are we just in the long tail?'

Retrieve and diff responses

Every `/v1/ask` response is archived for 30 days. Pull it back with `GET /v1/ask/:id` and diff against yesterday's snapshot to detect dropped or newly appearing citations. Ship the diff to your own dashboard or Slack channel; the primitive is there.
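
The diff itself is two set differences over canonical URLs. A minimal sketch:

```ts
// Diff two archived citation arrays by canonical URL.
// Works on any entry shape that carries a `url` field.
function diffCitations<T extends { url: string }>(yesterday: T[], today: T[]) {
  const prev = new Set(yesterday.map((c) => c.url));
  const curr = new Set(today.map((c) => c.url));
  return {
    added: today.filter((c) => !prev.has(c.url)),       // newly appearing
    removed: yesterday.filter((c) => !curr.has(c.url)), // dropped
  };
}
```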

For scheduled tracking, `/v1/monitors` runs the diff for you and ships a `citations_diff` object in the webhook payload (`added[]`, `removed[]`, `provider_changes[]`). That's the alert primitive most teams want: 'tell me when ChatGPT stops citing our pricing page' is a one-line filter on `removed[]`, no client-side diffing needed.
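
On the receiving side, that alert really is one predicate. A sketch assuming `removed[]` entries carry `url` and `providers_cited[]`; check the payload docs for the exact field names:

```ts
// Hypothetical webhook filter: fire when ChatGPT stops citing the pricing page.
interface CitationsDiff {
  added: { url: string; providers_cited: string[] }[];
  removed: { url: string; providers_cited: string[] }[];
  provider_changes: unknown[];
}

function pricingPageDropped(diff: CitationsDiff): boolean {
  return diff.removed.some(
    (c) => c.url.includes("/pricing") && c.providers_cited.includes("openai")
  );
}
```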

How citation extraction and canonicalization work under the hood

Each provider adapter has a format-specific extractor: Perplexity's inline `[1]`-style markers get matched against the response's `citations` array and rewritten out of `text` so the canonical text doesn't have visual noise. ChatGPT browsing's `tool_calls` are unwrapped to pull `function_call.arguments` URLs. Claude's `<citation>` XML blocks are parsed into `{url, title}` pairs. Gemini's `groundingMetadata.groundingChunks` are unrolled. The four extractors land in the same shape: `{url, domain, title, snippet, provider}`.

Canonicalization runs in two phases. First, redirect resolution: we follow up to 5 hops, resolve common shorteners (`t.co`, `bit.ly`, `news.google.com`), unwrap AMP cache URLs to their canonical, and stop at the first 200/3xx-terminal hit. Second, parameter cleanup: we strip the standard tracker set, normalize fragment identifiers, and collapse trailing slashes. The dedupe key is the cleaned canonical URL.
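
The hop-following phase can be approximated with `fetch` in manual-redirect mode. A simplified sketch for a server-side runtime (Node 18+); it omits the AMP unwrapping and shortener-specific handling described above:

```ts
// Follow up to 5 redirect hops, stopping at the first terminal response.
// Some servers reject HEAD; a production resolver falls back to GET.
async function resolveRedirects(url: string, maxHops = 5): Promise<string> {
  let current = url;
  for (let hop = 0; hop < maxHops; hop++) {
    const res = await fetch(current, { method: "HEAD", redirect: "manual" });
    const location = res.headers.get("location");
    if (res.status < 300 || res.status >= 400 || !location) {
      return current; // terminal: 2xx, 4xx/5xx, or a 3xx without Location
    }
    current = new URL(location, current).toString(); // resolve relative hops
  }
  return current; // hop budget exhausted; return the last URL seen
}
```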

Latency profile: for cached calls the resolved citations come back instantly (~140 ms total response). For uncached calls, the redirect resolver runs in parallel with the LLM dispatch where possible. By the time the slowest provider returns, most citations are already resolved. Worst case (a Perplexity answer with 10+ shortened URLs requiring redirect chains) adds ~200-500 ms over the raw LLM latency.

Edge cases worth knowing: when a redirect chain dead-ends in a 4xx, we keep the original cited URL and mark `resolved: false` rather than dropping the citation. Partial information beats silent loss. When two providers cite different URLs that both resolve to the same canonical (a common newspaper paywall vs. AMP scenario), we merge them on the resolved canonical and expose `source_urls[]` so audit trails stay intact.
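
Once URLs are resolved, the merge is plain bookkeeping. A sketch where `canonical` is a hypothetical intermediate field holding the resolved URL:

```ts
interface RawCitation {
  url: string;       // URL as originally cited by the provider
  canonical: string; // resolved canonical (hypothetical field name)
  provider: string;
}

// Group citations by resolved canonical, keeping source_urls[] for audit.
function mergeOnCanonical(citations: RawCitation[]) {
  const merged = new Map<
    string,
    { url: string; source_urls: string[]; providers_cited: string[] }
  >();
  for (const c of citations) {
    const entry = merged.get(c.canonical) ??
      { url: c.canonical, source_urls: [], providers_cited: [] };
    if (!entry.source_urls.includes(c.url)) entry.source_urls.push(c.url);
    if (!entry.providers_cited.includes(c.provider)) {
      entry.providers_cited.push(c.provider);
    }
    merged.set(c.canonical, entry);
  }
  return [...merged.values()];
}
```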

When to use citation tracking (and when to skip)

Use it when you're building content-attribution reporting for an SEO/GEO team, monitoring inbound citations to a documentation or marketing site, or shipping an alerting tool that fires on 'citation dropped' events. The combination of canonicalization, dedup, retention, and webhook diffs is the pattern most analytics workflows want, and the parser maintenance cost (each provider changes formats every quarter or two) is genuinely non-trivial to absorb in-house.

Use it specifically over a DIY parser when you need cross-provider data. A single-provider citation parser is a weekend project; four parsers plus a canonicalizer plus a redirect resolver plus a tracker stripper plus a dedupe layer plus a 30-day store is several weeks of engineer time. And building it once is a different discussion from maintaining it.

Skip it if you only need to track citations from one provider and don't expect to add others. A direct Perplexity API call plus a 50-line citation parser handles single-provider monitoring cheaply. Skip it also if your organization's analytics live entirely in Google Analytics or Mixpanel. Those tools won't see AI citations as referrers, so the data layer doesn't have a downstream consumer in your stack. In that case, ship the citation report as a static weekly PDF and revisit when you need real-time alerting.

FAQ

Frequently asked questions

Answer-first, dev-to-dev. Each one is also embedded as FAQPage schema for AI engines.

What is AI citation tracking?
AI citation tracking is monitoring which URLs answer engines (ChatGPT, Claude, Gemini, Perplexity) link to as sources for a given prompt. It's the closest thing to a 'backlink' the AI search era provides. MentionsAPI extracts every citation from every provider, resolves redirects to canonical URLs, deduplicates across providers, and tags each one with the prompt that surfaced it.
How does MentionsAPI handle different citation formats?
Perplexity uses inline `[1]` footnotes mapped to a sources array. ChatGPT browsing returns sources nested in tool-call responses. Gemini returns grounding chunks via `groundingMetadata`, sometimes with raw URLs in markdown. Claude with web search uses XML tags. We normalize all four into a single `citations[]` array of `{url, domain, title, snippet}` objects per response. One parser, four providers.
Are citation URLs canonicalized?
Yes. We resolve redirects from `t.co`, `news.google.com`, AMP caches, and other shorteners server-side, strip tracking parameters (`utm_*`, `fbclid`, `gclid`), and dedupe across providers. The URL you get is the URL the buyer would actually land on. Ready to feed into your analytics or content-attribution pipeline without further cleaning.
How do I track citations to my own domain?
Filter the top-level `citations[]` array client-side on your domain. Each entry includes `providers_cited[]` so you know which engines referenced the URL. Cross-reference with your prompt library to build a 'pages cited per prompt' table. That's the table that justifies content investments to the rest of the company.
How is this different from writing my own citation parser?
DIY means writing four format parsers, a redirect resolver, a tracker-param stripper, a dedupe layer, and a history store, and maintaining them as each provider changes formats. MentionsAPI runs all of that and returns a clean `citations[]` array. We've already lost the Sundays so you don't have to.
Can I detect when a citation is added or dropped?
Yes. Every `/v1/ask` response is archived for 30 days. `GET /v1/ask/:id` returns it later. Diff today's `citations[]` against yesterday's to detect newly appearing or dropped sources. Or schedule the prompt via `/v1/monitors` and the webhook payload includes a `citations_diff` object with `added[]`, `removed[]`, and `provider_changes[]` already computed.
How much does citation tracking cost?
Citation extraction is bundled into every `/v1/ask` call. Never a separate line item. The same $0.02 cache hit / $0.05-$0.75 per fresh call pricing applies. Webhooks via `/v1/monitors` are included. PAYG, $10 minimum top-up, no per-citation fees, no monthly plan.
What happens if a redirect chain breaks?
If a citation URL's redirect chain dead-ends in a 4xx, we keep the original URL with a `resolved: false` flag rather than dropping it. Partial information beats silent loss. The original `source_url` is also preserved in the response if you need it for audit, and the resolver retries the next time the same URL is cited so transient failures self-heal.
Code example

Get canonicalized citations across providers

Drop in your API key and you're live. Same response shape across every provider.

POST /v1/ask

```bash
curl https://api.mentionsapi.com/v1/ask \
  -H "Authorization: Bearer lvk_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "providers": ["perplexity", "openai"],
    "prompt": "How do I deploy a Next.js app to Cloudflare Pages?",
    "web_search": true
  }'
```
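
For orientation, a response might look roughly like this. Every value is invented for the curl call above; only the field names follow the descriptions on this page:

```ts
// Illustrative response sketch, not the formal schema.
const exampleResponse = {
  id: "ask_01h2x...", // hypothetical id format, used with GET /v1/ask/:id
  citations: [
    {
      url: "https://developers.cloudflare.com/pages/framework-guides/nextjs/",
      domain: "developers.cloudflare.com",
      title: "Deploy a Next.js site",
      snippet: "Create a new Pages project from your Next.js repository…",
      providers_cited: ["perplexity", "openai"],
      resolved: true,
    },
  ],
};
```
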
Compare

MentionsAPI vs. building citation parsing yourself

| The other way (DIY parser) | MentionsAPI |
| --- | --- |
| Per-provider citation formats | Single normalized schema |
| Redirect chains everywhere | Resolved canonical URLs |
| No history layer | Per-prompt citation history |
| Manual change detection | Webhook on citation changes |
Pricing

Top up from $10. Pay per call. No plans.

Citation tracking is bundled into every call. Never a separate line item. Webhooks via `/v1/monitors` are included. Pay-as-you-go from $0.02 per cached call, $0.05-$0.75 per fresh call depending on provider. $1 free signup credit, $10 minimum top-up. No monthly tiers, no commitment.

Stop wiring up four SDKs.

One API key, four answer engines, structured responses. $10 minimum top-up. Credits never expire.