The unified response schema
Every provider returns the same fields: `text` (the raw answer), `model` (the exact model version that handled the request), `latency_ms`, `tokens` (input and output counts), `citations` (URLs the model referenced, if any), and `mentions` (when you pass `track_brands`). Comparing answers becomes a JavaScript array operation, not a parsing project.
Errors are scoped to the provider that failed: the response includes a partial set of successes plus an `errors` array. Your code never has to worry about a single provider taking down the whole request.
We bias toward the lowest-common-denominator field set rather than exposing every provider quirk through the schema. If Gemini returns a `safety_attributes` block that nobody else returns, it lands in a per-provider `raw` field. Your normalization layer keeps working, and the people who care about per-provider extras can still get them.
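For concreteness, here's how that shape might be typed on the consumer side. This is an illustrative sketch, not an official SDK: field names come from the description above, but the exact types, optionality, and the shape of a `mentions` entry are assumptions.

```typescript
// Illustrative consumer-side types for the unified response.
// Field names follow the schema described above; exact types,
// optionality, and the mentions entry shape are assumptions.
interface ProviderResult {
  provider: "openai" | "anthropic" | "gemini" | "perplexity";
  text: string;                                   // the raw answer
  model: string;                                  // exact model version that handled the request
  latency_ms: number;
  tokens: { input: number; output: number };
  citations?: string[];                           // URLs the model referenced, if any
  mentions?: { brand: string; count: number }[];  // present when track_brands is passed
  raw?: unknown;                                  // per-provider extras, e.g. Gemini's safety_attributes
}

interface ProviderError {
  provider: string;
  code: string;                                   // e.g. "timeout"
  message: string;
}

interface CheckResponse {
  results: ProviderResult[];                      // the providers that succeeded
  errors: ProviderError[];                        // scoped to the providers that failed
}
```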
Parallel execution by default
Calling four LLMs sequentially is the slowest possible thing you can do. We dispatch all providers in parallel, with smart timeouts so a slow provider never blocks the response. Total latency is the latency of the slowest single provider, not the sum.
Aggressive caching keeps repeat queries at near-zero latency. We've measured median total response times of ~1.2 seconds for cached prompts across all four providers, which is faster than a single uncached OpenAI call. Per-provider timeouts default to 8 s for non-search calls and 25 s for `web_search: true` calls; if Perplexity hangs, the response still ships with the three providers that came back, plus a timeout entry in `errors[]`.
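The pattern is roughly `Promise.allSettled` with a per-provider abort budget. A minimal sketch, using the illustrative types above and a hypothetical `callProvider` adapter; the real adapters also handle auth and retries, and the `"upstream_error"` code is illustrative.

```typescript
// Minimal sketch of timeout-bounded parallel fan-out.
// `callProvider` is a hypothetical per-provider adapter, not a real API.
declare function callProvider(
  provider: string,
  prompt: string,
  opts: { signal: AbortSignal }
): Promise<ProviderResult>;

async function fanOut(prompt: string, providers: string[], webSearch: boolean): Promise<CheckResponse> {
  const timeoutMs = webSearch ? 25_000 : 8_000;   // per-provider budget from the defaults above
  const settled = await Promise.allSettled(
    providers.map((p) => callProvider(p, prompt, { signal: AbortSignal.timeout(timeoutMs) }))
  );

  const results: ProviderResult[] = [];
  const errors: ProviderError[] = [];
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") {
      results.push(outcome.value);
    } else {
      const isTimeout = (outcome.reason as { name?: string })?.name === "TimeoutError";
      errors.push({
        provider: providers[i],
        code: isTimeout ? "timeout" : "upstream_error",  // "upstream_error" is an illustrative code
        message: String(outcome.reason),
      });
    }
  });
  return { results, errors };                      // a slow provider never blocks the rest
}
```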
The fan-out is also where most of our cost discipline lives. We cap concurrency per upstream key at the level each provider tolerates without throttling, so a 100-prompt batch doesn't trigger a 429 cascade. You get back the full set of answers, in order, without writing your own retry queue.
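Concurrency capping itself is a standard pattern; here's a toy version for illustration (our actual queueing and retry logic is more involved).

```typescript
// Toy concurrency limiter: run at most `limit` calls at once against one upstream key.
// Illustrative pattern only; a rejected call rejects the whole batch here.
async function mapWithLimit<T, R>(items: T[], limit: number, fn: (item: T) => Promise<R>): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);   // results come back in the original order
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}
```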
Built for comparison workflows
Most multi-LLM use cases boil down to comparison: 'which model picks our brand,' 'which model has the better citation,' 'which model is cheapest for this query.' The unified schema lets you pivot on any of those dimensions without preprocessing. Build a dashboard column per provider in an afternoon.
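For example, turning `results[]` into one dashboard column per provider is a single pass. This uses the illustrative `ProviderResult` type from earlier; matching `brand` against `mentions` assumes the entry shape sketched there.

```typescript
// Pivot the unified results[] into one dashboard column per provider.
// Uses the illustrative ProviderResult type sketched earlier.
function toDashboardRow(results: ProviderResult[], brand: string) {
  return Object.fromEntries(
    results.map((r) => [
      r.provider,
      {
        model: r.model,
        latency_ms: r.latency_ms,
        citation_count: r.citations?.length ?? 0,
        mentions_brand: r.mentions?.some((m) => m.brand === brand) ?? false,
      },
    ])
  );
}
```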
Because every raw answer is archived for 30 days, you can also re-run brand extraction over historical answers without re-billing the LLM. Add a competitor to `track_brands`, hit `GET /v1/ask/:id`, and get the new mentions back in milliseconds. Useful for backfilling share-of-voice when a customer asks about a competitor that wasn't on the original list.
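A hedged sketch of that backfill call follows; the base URL, auth header, and the query-parameter shape for passing an updated `track_brands` list are assumptions, not documented parameters.

```typescript
// Sketch: re-read an archived answer and pull the freshly extracted mentions.
// Base URL, auth header, and the track_brands query parameter are assumptions.
async function backfillMentions(askId: string, brands: string[], apiKey: string) {
  const url =
    `https://api.example.com/v1/ask/${askId}` +
    `?track_brands=${encodeURIComponent(brands.join(","))}`;
  const res = await fetch(url, { headers: { Authorization: `Bearer ${apiKey}` } });
  if (!res.ok) throw new Error(`Backfill failed: ${res.status}`);
  const body: CheckResponse = await res.json();
  return body.results.map((r) => ({ provider: r.provider, mentions: r.mentions ?? [] }));
}
```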
How the multi-LLM fan-out works under the hood
When `/v1/check` receives a request with `providers: ["openai", "anthropic", "gemini", "perplexity"]`, we hash the canonicalized request body (prompt + provider set + tracked brands + model overrides) and look it up in the shared 24-hour cache. Cache hits return immediately at $0.02. Cache misses trigger four parallel `fetch` calls, each wrapped in a per-provider adapter that handles that upstream's auth, request shape, and response idiosyncrasies.
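The cache key amounts to a hash of a canonicalized request body, along these lines (the exact canonicalization rules shown here are assumptions):

```typescript
import { createHash } from "node:crypto";

// Sketch of the cache-key derivation described above: hash the canonicalized request
// (prompt + provider set + tracked brands + model overrides). Sorting makes the key
// insensitive to list order; the precise canonicalization rules are assumptions.
function cacheKey(req: {
  prompt: string;
  providers: string[];
  track_brands?: string[];
  model_overrides?: Record<string, string>;
}): string {
  const canonical = JSON.stringify({
    prompt: req.prompt,
    providers: [...req.providers].sort(),
    track_brands: [...(req.track_brands ?? [])].sort(),
    model_overrides: req.model_overrides ?? {},
  });
  return createHash("sha256").update(canonical).digest("hex");
}
```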
Each adapter normalizes its provider's output into the common schema before the fan-out resolver merges them. OpenAI's `choices[0].message.content` becomes `text`. Anthropic's `content[].text` blocks get joined. Perplexity's inline `[1]` markers get stripped from `text` and rebuilt into the canonical `citations[]` array. Gemini's grounding metadata is parsed for cited URLs. The merge step also computes a top-level `citations[]` (deduplicated across providers, with `providers_cited` per URL) and a `mentions[]` array per provider when `track_brands` is set.
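Abbreviated, that normalization looks like the following. The upstream response types are heavily trimmed, and the inline-marker regex is a simplification of the real citation rebuild.

```typescript
// Illustrative normalization of two upstream shapes into the common `text` field.
// Upstream types are heavily abbreviated; real responses carry many more fields.
type OpenAIChat = { choices: { message: { content: string } }[] };
type AnthropicMessage = { content: { type: string; text?: string }[] };

const textFromOpenAI = (r: OpenAIChat): string => r.choices[0].message.content;

const textFromAnthropic = (r: AnthropicMessage): string =>
  r.content.filter((b) => b.type === "text").map((b) => b.text).join("");

// Perplexity-style inline markers like [1] are stripped from `text`; the real pipeline
// also maps them back to URLs for the canonical citations[] array (omitted here).
const stripInlineMarkers = (text: string): string => text.replace(/\s*\[\d+\]/g, "").trim();
```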
Latency profile across our fleet, sampled over the last 30 days: cached `mode: quick` median ~1.2 s, p99 ~2.1 s. Uncached `mode: quick` (no web search) median ~3.8 s, p99 ~7.4 s. Full fan-out with `web_search: true` median ~6.5 s, p99 ~14 s. Perplexity's web-grounded path is the long pole. If you need predictability over freshness, leave `cache_bypass` at `false` (the default) and most production workloads will land in the cached band.
Failure modes are bounded: a single provider 5xx returns an `errors[]` entry, never a 500 on the parent call. A timeout is recorded as `error.code: "timeout"` with the elapsed milliseconds. A schema-level mismatch (rare, and usually a provider rolling out a breaking change) is logged, and that provider's slot is omitted from `results[]` rather than silently corrupting the response.
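On the consumer side, that means treating `errors[]` as degraded data rather than a failed request, for example:

```typescript
// Treat errors[] as degraded data, not a failed request (illustrative).
function summarizeOutcome(res: CheckResponse) {
  const answered = res.results.map((r) => r.provider);
  const degraded = res.errors.map((e) => `${e.provider}: ${e.code}`);
  return { answered, degraded };  // e.g. { answered: ["openai", "anthropic", "gemini"], degraded: ["perplexity: timeout"] }
}
```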
When to use a multi-LLM API (and when not to)
Use `/v1/check` when you're building anything that needs cross-provider data: brand-monitoring tools, GEO scoring, AI-visibility dashboards, model-comparison reports, prompt-quality benchmarks, or vendor-risk reviews where 'are we too dependent on one model?' is a real question. The endpoint also fits internal tools: pricing teams running competitive prompts daily, content teams checking which engine cites them, and data teams running regression tests across model upgrades.
Don't use it for single-shot generative work. If you're building a chat product, a summarization pipeline, or a one-LLM agent loop, the OpenAI or Anthropic SDK is cheaper and more featureful. They expose streaming, function calling, and provider-specific knobs we deliberately don't surface. Multi-LLM fan-out is overhead you don't need when only one model's answer matters.
Don't use it as a routing layer either. If your goal is 'pick the cheapest provider for this prompt and return that one answer,' OpenRouter is purpose-built for that. We're optimized for the inverse problem: you want every provider's answer side by side, normalized, with brand and citation extraction layered on top. Different shape, different bill.