For developers building AI-visibility tools

The best ChatGPT API wrapper with built-in brand tracking

A ChatGPT API wrapper is a layer that sits in front of OpenAI's chat-completions endpoint and adds capabilities the raw SDK doesn't ship. MentionsAPI wraps GPT-5 with brand-mention extraction, citation parsing, response caching, and a schema that matches Claude, Gemini, and Perplexity, so the same client call returns structured data instead of prose.

If you're calling the OpenAI API directly to monitor how your brand surfaces in ChatGPT answers, you've already discovered the problem: raw chat completions are unstructured. You get a wall of prose, you regex for your brand name, you parse out source links, and then you do it all again the next time the model phrases things differently.

MentionsAPI is a thin wrapper over ChatGPT (and three other answer engines) that solves the structured-output problem at the edge. You pass a `track_brands` array, we run the prompt through GPT-5, and you get back a clean object: every mention with its position in the answer, sentiment, and the cited URL.

It's the API you'd build yourself after the third time someone on your team broke the regex. We just shipped it first.

Top up from $10 · Pay per call · Credits never expire

Why a wrapper instead of the OpenAI SDK?

The OpenAI SDK is excellent for building chat products. It is not built for brand monitoring. You get unstructured text, no citation parsing, and no caching layer. Every brand-tracking workflow ends up duplicating the same three or four utility files. Worse, when your boss asks 'what does Claude say?', you start the whole project over from scratch.

MentionsAPI flips the model: a single endpoint, a single response schema, four LLMs swappable via a `providers` array. Add Gemini to your dashboard with a one-line code change.
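
Concretely, that one-line change looks like this (the `gemini` provider identifier is our assumption for illustration; `openai` follows the code example further down this page):

- "providers": ["openai"]
+ "providers": ["openai", "gemini"]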

There's also an auth-flattening benefit that's easy to underestimate. You authenticate once with your MentionsAPI bearer token; we hold the OpenAI key, the Anthropic key, the Gemini key, and the Perplexity key on the server side and rotate them quietly. Your `.env` doesn't grow every time you add a provider, and your security-review surface stays exactly one credential wide.

What you get out of the box

Pass a prompt and a list of brands you care about. The response includes the raw answer text, an array of mentions (each with position, sentiment, and surrounding context), and a list of cited URLs with their domains resolved. We handle the LLM rate limits, retries, and caching. Your code becomes a single fetch call.
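
A minimal TypeScript sketch of that single call (the request fields follow the curl example later on this page; the commented response fields mirror what this page describes, and the exact typings are ours, not a published client):

const res = await fetch("https://api.mentionsapi.com/v1/ask", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.MENTIONSAPI_KEY}`, // your one credential
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    providers: ["openai"],
    prompt: "What are the best project management tools in 2026?",
    track_brands: ["Notion", "Linear", "Asana"],
  }),
});

const data = await res.json();
// data.text      -> the raw answer text
// data.mentions  -> [{ brand, position, sentiment, surrounding_text }, ...]
// data.citations -> cited URLs with their domains resolved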

Because we share a 24-hour cache across customers running similar prompts, your costs drop by an order of magnitude versus calling OpenAI directly for repeat queries, and you can override with `cache_bypass: true` whenever you need a fresh result.

Brand matching uses a Levenshtein-≤2 fuzzy comparator on top of case-insensitive normalization, so 'OpenAI' and 'Open AI' (or 'GitHub' and 'Github') don't quietly mismatch. We also track sentence-level position, not just first-mention offset, so a brand that gets named in the closing sentence after a competitor lead doesn't get a misleading 'first mention' score in your dashboard.
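
A minimal sketch of that comparator, ours for illustration rather than the production code: normalize case first, then accept anything within edit distance 2.

// Levenshtein edit distance via the standard dynamic-programming table.
function levenshtein(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, (_, i) =>
    [i, ...new Array<number>(b.length).fill(0)],
  );
  for (let j = 1; j <= b.length; j++) dp[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1), // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Case-insensitive normalization, then distance <= 2 counts as a match:
// "Open AI" vs "OpenAI" is distance 1 (one dropped space), so it matches.
function brandsMatch(candidate: string, brand: string): boolean {
  const norm = (s: string) => s.toLowerCase().trim();
  return levenshtein(norm(candidate), norm(brand)) <= 2;
}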

Built for GEO and brand-monitoring tools

We built MentionsAPI for developers who want one HTTP call to do something their toolkit makes painful: ask every major LLM about their brand and get back a structured answer. If you're building a Generative Engine Optimization product, an SEO agency dashboard, or an internal brand-watch tool, this saves you weeks of glue code.

It's also a clean drop-in for teams already running OpenAI in production. The request body accepts the prompt and (optionally) the model overrides; the response keeps the original `text` so any existing rendering logic still works, and the `mentions[]` and `citations[]` arrays are additive. Most teams adopt it by changing the URL and the auth header. Total integration cost is the time it takes to update an environment variable.
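
Assuming your existing client is a plain fetch, the switch really is two lines (the env-var names here are ours):

- const url = "https://api.openai.com/v1/chat/completions";
- const key = process.env.OPENAI_API_KEY;
+ const url = "https://api.mentionsapi.com/v1/ask";
+ const key = process.env.MENTIONSAPI_KEY;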

How the ChatGPT wrapper works under the hood

Your request hits our edge, gets canonicalized (prompt + provider set + tracked brands + model override), and we hash it for cache lookup. A cache hit returns the stored normalized response in roughly 80-200 ms total round-trip and bills at $0.02. A miss invokes our OpenAI adapter, which sends the chat-completions request through GPT-5 (or your overridden model), unwraps `choices[0].message.content` into `text`, and pulls any tool-call source URLs into a per-provider `citations[]`.
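
A sketch of that cache-key step, assuming SHA-256 over the sorted request fields (our illustration; the real canonicalization rules and hash choice may differ):

import { createHash } from "node:crypto";

interface CheckRequest {
  prompt: string;
  providers: string[];
  track_brands: string[];
  model?: Record<string, string>; // e.g. { openai: "gpt-4o" }
}

// Sort the unordered fields so logically identical requests serialize
// to byte-identical JSON, then hash the result for the cache lookup.
function cacheKey(req: CheckRequest): string {
  const canonical = JSON.stringify({
    prompt: req.prompt.trim(),
    providers: [...req.providers].sort(),
    track_brands: [...req.track_brands].sort(),
    model: req.model ?? null,
  });
  return createHash("sha256").update(canonical).digest("hex");
}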

Brand extraction runs on the returned text in a deterministic pass: we tokenize on sentence boundaries, run case-insensitive plus alias matching for each `track_brands` entry, compute the character offset and sentence index, and score the surrounding sentence for sentiment using a calibrated classifier, not a re-prompt to another LLM. That means the extraction step is cheap (sub-50 ms typical) and reproducible: the same input text always returns the same mentions array.
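
The pass itself is simple enough to sketch in TypeScript (exact matching only here; the alias and fuzzy matching described earlier, and the sentiment classifier, are elided):

interface Mention {
  brand: string;
  position: number; // character offset in the full answer text
  sentence_index: number;
  surrounding_text: string;
}

// Deterministic: the same text and brand list always yield the same array.
function extractMentions(text: string, brands: string[]): Mention[] {
  const sentences = text.split(/(?<=[.!?])\s+/); // naive sentence tokenizer
  const mentions: Mention[] = [];
  let cursor = 0;
  sentences.forEach((sentence, i) => {
    const start = text.indexOf(sentence, cursor); // exact offset in full text
    cursor = start + sentence.length;
    for (const brand of brands) {
      const at = sentence.toLowerCase().indexOf(brand.toLowerCase());
      if (at >= 0) {
        mentions.push({
          brand,
          position: start + at,
          sentence_index: i,
          surrounding_text: sentence,
        });
      }
    }
  });
  return mentions;
}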

We measured uncached single-provider OpenAI latency at p50 ~1.6 s, p95 ~3.4 s, p99 ~5.8 s over the last 30 days of production traffic. That's close to a direct OpenAI SDK call once the cache-lookup overhead is amortized. Cache hits on the same prompts run p50 ~140 ms because they skip OpenAI entirely. Repeat queries against the same prompt list cluster around a 70-80% cache hit rate in production agency workloads, which is where the cost story actually wins.

Per the methodology page, brand extraction confidence is reported with a Wilson 95% confidence interval over the rolling sample. If a customer asks 'why didn't ChatGPT mention us in this answer?', you can hand them a deterministic, replayable record, not a maybe-it-was-a-bad-day shrug.
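
The Wilson interval itself is a standard formula, sketched here in TypeScript for reference (the 480-of-500 example is hypothetical):

// Wilson score interval for a binomial proportion; z = 1.96 for 95%.
function wilson95(successes: number, n: number): [number, number] {
  const z = 1.96;
  const p = successes / n;
  const denom = 1 + (z * z) / n;
  const center = (p + (z * z) / (2 * n)) / denom;
  const margin = (z / denom) * Math.sqrt((p * (1 - p)) / n + (z * z) / (4 * n * n));
  return [center - margin, center + margin];
}

// wilson95(480, 500) -> approximately [0.939, 0.974] for an observed 96% rate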

When to use this wrapper (and when to skip it)

Use this wrapper when you need structured outputs from ChatGPT for monitoring, comparison, or analytics. Anything where the raw `text` is the start of your pipeline, not the end. GEO tools, brand-mention dashboards, content-attribution loops, and AI-search benchmarks are all the right shape: you want the same prompt run repeatedly, normalized, and stored for re-extraction with new brand lists later.

Use it specifically over the OpenAI SDK when you're going to add Claude, Gemini, or Perplexity later. Most teams that start single-provider end up multi-provider within a quarter. Your boss sees the dashboard and asks 'what does Claude say?', and the rewrite cost is real. Adopting the wrapper now makes that future change a one-line array edit.

Skip it for chat products. If you're building a customer-facing assistant, a copy generator, or anything where you want streaming, function calling, or vision inputs, the OpenAI SDK exposes knobs we deliberately don't surface. Multi-provider normalization is overhead you don't need when only one model's answer matters and you want the latest provider-specific feature the day it ships. Use the SDK directly there; come back to MentionsAPI when the question shifts from 'generate good output' to 'measure where our brand shows up'.

FAQ

Frequently asked questions

Answer-first, dev-to-dev. Each one is also embedded as FAQPage schema for AI engines.

What is a ChatGPT API wrapper?
A ChatGPT API wrapper is a layer that sits in front of OpenAI's chat-completions endpoint and adds capabilities the raw SDK doesn't ship. MentionsAPI adds brand-mention extraction, citation parsing, response caching, and a unified schema that matches Claude, Gemini, and Perplexity. You call one endpoint and get structured data instead of raw prose.
Why use a wrapper instead of the OpenAI SDK directly?
The OpenAI SDK is built for chat products, not brand monitoring. You get unstructured prose, no citation parsing, and no caching layer. Every brand-tracking codebase ends up duplicating the same regex utilities. MentionsAPI returns a structured `mentions[]` array (with position, sentiment, context) and shares a 24-hour cache, dropping repeat-query costs by an order of magnitude.
Can I track brand mentions in ChatGPT API answers?
Yes. That's the core feature. Pass `"providers": ["openai"]` and a `track_brands` array to `/v1/check`. The response includes the raw answer text plus a `mentions[]` array where each entry has `{brand, position, sentiment, surrounding_text}`. Add Claude, Gemini, or Perplexity to the providers array and the same extraction runs against them too.
Do I need an OpenAI API key?
No. MentionsAPI bundles the OpenAI key behind your bearer token. Auth is `Authorization: Bearer lvk_live_..` and we handle the upstream OpenAI rate limits, retries, and errors. Bring-your-own-keys (BYOK) is on the V2 roadmap for Enterprise. Email [email protected] if you need it now.
How much does a ChatGPT call cost via MentionsAPI?
$0.05 for a single-provider call with no web_search, $0.15 with web_search enabled, $0.02 if it hits the shared 24-hour cache. Compared to direct OpenAI billing (token-based, $0.10-$0.30 for a moderate prompt with output), MentionsAPI is cheaper for repeat-query monitoring workflows because of the cache. For one-off generative work, OpenAI direct is cheaper.
Does the API miss what real ChatGPT users see?
The OpenAI API and the chatgpt.com UI run different retrieval pipelines on top of the same model. We measured the gap on 1,000 prompts: 96% of ChatGPT API answers showed at least one meaningful difference from the live UI in brand mentions, citations, or ranking. If you need parity with what your end-customer sees, use `mode: perplexity_live` ($0.25) or pair the API call with our delta-report tooling.
Can I bypass the cache for fresh results?
Yes. Pass `"cache_bypass": true` in the request body and the call goes straight to OpenAI, returning a fresh result and billing at the fresh-call rate ($0.05-$0.15). Useful when you want to verify a model behavior change or run a real-time check on a prompt that's already in cache.
Does MentionsAPI work with GPT-5?
Yes. Default OpenAI model is GPT-5; override per call by passing `model: { openai: "gpt-4o" }` or another supported version. The response schema is identical regardless of model version, so swapping models doesn't break your downstream code. Specific version pinning is useful for regression testing or locking in a model for compliance reasons.
Code example

Track your brand in a ChatGPT answer

Drop in your API key and you're live. Same response shape across every provider.

 POST /v1/ask
curl https://api.mentionsapi.com/v1/ask \
  -H "Authorization: Bearer lvk_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "providers": ["openai"],
    "prompt": "What are the best project management tools in 2026?",
    "track_brands": ["Notion", "Linear", "Asana"]
  }'
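
The response mirrors the shape described above. An illustrative (not verbatim) example, with field names as documented in the FAQ and values invented:

{
  "text": "For most teams in 2026, Notion and Linear lead the pack...",
  "mentions": [
    {
      "brand": "Notion",
      "position": 24,
      "sentiment": "positive",
      "surrounding_text": "For most teams in 2026, Notion and Linear lead the pack..."
    }
  ],
  "citations": [
    { "url": "https://example.com/best-pm-tools", "domain": "example.com" }
  ]
}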
Compare

MentionsAPI vs. calling OpenAI directly

| The other way: OpenAI SDK | MentionsAPI |
| --- | --- |
| Unstructured prose; you parse | Structured mentions array |
| No caching; every call hits OpenAI | Shared 24h cache, tunable per-call |
| ChatGPT only; re-wire for Claude | Swap providers with one array |
| Token billing; surprises | Bundled per-call pricing |
Pricing

Top up from $10. Pay per call. No plans.

Pay-as-you-go. /v1/check?mode=quick costs $0.02 per call (4 LLM APIs in parallel). /v1/check?mode=perplexity_live is $0.25 (UI scrape with full citations + fan_out). No monthly tiers, no commitment. $1 free signup credit, $10 minimum top-up, credits never expire.

Stop wiring up four SDKs.

One API key, four answer engines, structured responses. $10 minimum top-up. Credits never expire.