AI Engines Agree on Brands More Than Sources
A June 2026 study of 1,373 AI answers found that five AI engines usually agree on whether a brand belongs in the answer, yet their cited source domains diverge sharply. Gemini and Google AI Overview are the exception, with 98.4% brand-mention agreement and 0.643 citation-domain overlap.
Methodology
We pulled anonymized aggregate Foglift monitoring data from 2026-06-16 through 2026-06-20 UTC. The analysis keeps the latest answer per workspace, query, and engine, then measures the 62 workspace-query groups where all five engines produced an answer. For each engine pair, brand-mention agreement is the share of those groups in which both engines made the same yes/no call on whether the brand belonged in the answer. Citation-domain overlap is Jaccard similarity over normalized cited root domains: in plain terms, how much two engines drew from the same source sites. No workspace, customer, query, answer text, or individual source URL is published.
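The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the production aggregation script: the `Answer` record shape, the engine identifiers, and both function names are assumptions made for the example.

```python
from datetime import datetime
from typing import NamedTuple

# Hypothetical record shape; the production schema is not published.
class Answer(NamedTuple):
    workspace: str
    query: str
    engine: str
    captured_at: datetime
    brand_mentioned: bool

# Hypothetical engine identifiers for the five engines in the study.
ENGINES = frozenset(
    {"chatgpt", "claude", "gemini", "google_ai_overview", "perplexity"}
)

def latest_per_key(answers):
    """Keep only the most recent answer for each (workspace, query, engine)."""
    latest = {}
    for a in answers:
        key = (a.workspace, a.query, a.engine)
        if key not in latest or a.captured_at > latest[key].captured_at:
            latest[key] = a
    return latest

def complete_groups(latest):
    """Return the workspace-query groups in which every engine has an answer."""
    groups = {}
    for (workspace, query, engine), a in latest.items():
        groups.setdefault((workspace, query), {})[engine] = a
    return {key: g for key, g in groups.items() if ENGINES.issubset(g)}
```

Under these assumptions, `complete_groups(latest_per_key(answers))` yields the set of groups that the pairwise metrics are computed over (62 in this sample).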
The finding
If you optimize for AI visibility, one engine mentioning your brand is not enough. The engines may agree that your brand belongs in an answer, but they often justify that answer with different source sites. That means source coverage has to be earned and measured engine by engine.
Both layers matter. The engines often converge on the brand decision, while the cited source layer stays fragmented. In the 62 workspace-query groups where ChatGPT, Claude, Gemini, Google AI Overview, and Perplexity all produced an answer, nine of the ten engine pairs agreed on the yes/no brand-mention decision more than 90% of the time.
Their source sets told a different story. Gemini and Google AI Overview shared a large source layer, with 0.643 average citation-domain Jaccard overlap. Every other pair fell below 0.2. Gemini and Perplexity reached only 0.197. Google AI Overview and Perplexity reached 0.188. ChatGPT and Claude, the lowest pair, reached 0.027.
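To make the overlap numbers concrete, here is a small sketch of Jaccard similarity over cited root domains. Two parts are assumptions, since the report does not specify them: `root_domain` uses a naive "last two labels" rule (a real pipeline would more likely use the Public Suffix List), and `jaccard` returns 0.0 when neither engine cites anything. The example domains are invented.

```python
from urllib.parse import urlsplit

def root_domain(url: str) -> str:
    """Naive normalization (assumption): lowercase host, keep last two labels."""
    # Prefix "//" so bare hosts like "blog.example.com" parse as a netloc.
    host = urlsplit(url if "://" in url else "//" + url).hostname or ""
    labels = host.rstrip(".").split(".")
    return ".".join(labels[-2:])

def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|; returns 0.0 for two empty sets (assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Invented citations for one query, from two hypothetical engines.
engine_a = {root_domain(u) for u in [
    "https://www.example.com/pricing",
    "https://reviews.example.org/top-tools",
    "https://news.example.net/story",
]}
engine_b = {root_domain(u) for u in [
    "https://example.com/blog/post",
    "https://example.org/",
    "https://forum.example.io/thread/1",
    "https://docs.example.dev/guide",
]}

overlap = jaccard(engine_a, engine_b)  # 2 shared domains out of 5 distinct
```

Here the two engines share `example.com` and `example.org` out of five distinct root domains, so the overlap is 2 / 5 = 0.4. The figures reported in this study are averages of this per-query value across the 62 complete groups.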
Engine-level source density
The five engines had similar brand-mention rates in the measurement window, ranging from 52.7% to 56.5%. The larger difference was how much source material each answer exposed. Gemini, Google AI Overview, and Perplexity averaged roughly 10 to 11 citations per answer. ChatGPT averaged 3.28. Claude averaged 1.34 and exposed citations in 41.2% of answers.
| Engine | Answers | Mention rate | Avg citations | Citation coverage |
|---|---|---|---|---|
| ChatGPT | 246 | 56.5% | 3.28 | 96.3% |
| Claude | 245 | 54.7% | 1.34 | 41.2% |
| Gemini | 244 | 56.1% | 11.15 | 100.0% |
| Google AI Overview | 406 | 52.7% | 10.43 | 99.5% |
| Perplexity | 232 | 53.9% | 11.32 | 100.0% |
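The columns in this table can be derived from per-answer records along these lines. This is a sketch under assumptions: the `(engine, brand_mentioned, n_citations)` row shape and the `engine_summary` name are hypothetical, average citations is assumed to be taken over all answers (including those with none), and citation coverage is taken to mean the share of answers with at least one surfaced citation.

```python
from collections import defaultdict

def engine_summary(rows):
    """rows: iterable of (engine, brand_mentioned, n_citations) tuples."""
    buckets = defaultdict(list)
    for engine, mentioned, n_citations in rows:
        buckets[engine].append((mentioned, n_citations))

    summary = {}
    for engine, items in buckets.items():
        n = len(items)
        summary[engine] = {
            "answers": n,
            # Share of answers that mention the brand.
            "mention_rate": sum(m for m, _ in items) / n,
            # Mean citations per answer, zeros included (assumption).
            "avg_citations": sum(c for _, c in items) / n,
            # Share of answers exposing at least one citation.
            "citation_coverage": sum(c > 0 for _, c in items) / n,
        }
    return summary

# Sanity check on the published table: per-engine answer counts
# add up to the 1,373 answers in the study.
ANSWERS_PER_ENGINE = {
    "ChatGPT": 246,
    "Claude": 245,
    "Gemini": 244,
    "Google AI Overview": 406,
    "Perplexity": 232,
}
assert sum(ANSWERS_PER_ENGINE.values()) == 1373
```

Keeping average citations and citation coverage as separate columns matters for an engine like Claude, where a low average is driven largely by answers that surface no citations at all.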
Pairwise agreement vs. source overlap
The strongest split appears when the answer layer and source layer are measured side by side. Gemini and Google AI Overview look like siblings: high brand-mention agreement and high source overlap. Perplexity agrees with the Google engines on the brand decision but cites a different source pool. ChatGPT and Claude converge less often on the brand decision and share almost no cited domains.
Brand-mention agreement tells you whether two engines made the same call on inclusion. Citation-domain overlap tells you whether they cited the same sites. A high agreement score with a low overlap score is the important pattern: the engines can choose the same brand for different evidentiary reasons.
| Engine pair | Brand-mention agreement | Avg citation-domain Jaccard |
|---|---|---|
| Gemini ↔ Google AI Overview | 98.4% | 0.643 |
| Gemini ↔ Perplexity | 96.8% | 0.197 |
| Google AI Overview ↔ Perplexity | 95.2% | 0.188 |
| ChatGPT ↔ Gemini | 93.5% | 0.084 |
| Claude ↔ Gemini | 91.9% | 0.062 |
| Claude ↔ Perplexity | 91.9% | 0.042 |
| ChatGPT ↔ Google AI Overview | 91.9% | 0.082 |
| ChatGPT ↔ Perplexity | 90.3% | 0.054 |
| Claude ↔ Google AI Overview | 90.3% | 0.062 |
| ChatGPT ↔ Claude | 85.5% | 0.027 |
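One way to read this table programmatically is sketched below. The pair values are copied from the table; everything else is an assumption for illustration. The match counts are inferred by assuming each percentage is a count out of the 62 complete groups, and the 90% and 0.2 thresholds used to flag the "same answer, different sources" pattern are arbitrary cut-offs, not part of the study.

```python
N_GROUPS = 62  # complete workspace-query groups in the sample

# (brand-mention agreement %, avg citation-domain Jaccard), from the table.
PAIRS = {
    ("Gemini", "Google AI Overview"): (98.4, 0.643),
    ("Gemini", "Perplexity"): (96.8, 0.197),
    ("Google AI Overview", "Perplexity"): (95.2, 0.188),
    ("ChatGPT", "Gemini"): (93.5, 0.084),
    ("Claude", "Gemini"): (91.9, 0.062),
    ("Claude", "Perplexity"): (91.9, 0.042),
    ("ChatGPT", "Google AI Overview"): (91.9, 0.082),
    ("ChatGPT", "Perplexity"): (90.3, 0.054),
    ("Claude", "Google AI Overview"): (90.3, 0.062),
    ("ChatGPT", "Claude"): (85.5, 0.027),
}

def implied_matches(agreement_pct: float, n: int = N_GROUPS) -> int:
    """Infer how many of n groups a pair agreed on (inference, not published)."""
    return round(agreement_pct / 100 * n)

def same_answer_different_sources(pairs, min_agreement=90.0, max_jaccard=0.2):
    """Flag pairs with high agreement but low source overlap (hypothetical cut-offs)."""
    return [
        pair
        for pair, (agreement, overlap) in pairs.items()
        if agreement >= min_agreement and overlap < max_jaccard
    ]

flagged = same_answer_different_sources(PAIRS)
```

With these cut-offs, eight of the ten pairs are flagged. The two exceptions are Gemini and Google AI Overview, whose overlap is high, and ChatGPT and Claude, whose agreement falls below 90%. The inferred counts also show how coarse the percentages are at this sample size: 98.4% corresponds to 61 of 62 groups, and 85.5% to 53 of 62.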
Interpretation
This points to a two-step model of AI search visibility. First, the engine builds a candidate set from its retrieved or internally available evidence. Second, it synthesizes a brand answer from that evidence. The candidate-source layer varies sharply by engine. The synthesis layer can still land on the same brand decision.
The Google pair is the clearest evidence for a shared retrieval substrate. Gemini and Google AI Overview agree on the brand decision in 98.4% of complete query groups and share far more cited domains than any other pair. That does not mean the answers are identical. It means the source universe feeding the answers is visibly related.
Perplexity is the opposite pattern. It often agrees with Gemini and Google AI Overview on whether the brand belongs in the answer, yet its citation overlap with them is much lower. That is a retrieval difference with answer-level convergence.
What to do with it
A blended AI Visibility score is useful for the board-level trend. It is too blunt for source acquisition. Source strategy has to be engine-specific.
- For Gemini and Google AI Overview, treat improvements as partially shared. Content that becomes a strong source for one has a realistic chance of helping the other.
- For Perplexity, audit the citation panel directly. The brand answer can match the Google engines while the source set comes from a different publisher universe.
- For ChatGPT and Claude, track both mention status and cited-source density. A brand can be named with few surfaced sources, which makes source attribution a separate optimization problem.
The practical takeaway is simple: optimize the answer and the source layer separately. The answer layer tells you whether the brand is being selected. The source layer tells you which publisher ecosystem is doing the work.
Limits
This is production monitoring data, not a controlled query benchmark. Queries span multiple workspaces and industries, and the measurement window covers only 2026-06-16 through 2026-06-20 UTC. We publish only anonymized aggregates. The sample is strong enough to show the source-vs-answer split, but it should be refreshed quarterly with a controlled query set before being used as a category-wide law.
Citation-domain Jaccard measures surfaced citations only. It does not claim to observe every page an engine retrieved internally. That makes the metric conservative: it measures the source layer available to users and downstream citation analysis.
Reproducibility
The aggregation script is saved under state/research/engine-source-divergence-2026-aggregation.mjs. It reads production monitoring answers, keeps only anonymized aggregate metrics, and emits the same tables used here. The downloadable CSV contains the measurement window, engine summary, and pairwise overlap table.
- Earlier controlled-query reference benchmark: AI Search Citation Benchmark, Q2 2026
- Companion source-overlap analysis: ChatGPT vs. Google AI Overview
- Engine content-type breakdown: Five AI Engines, Five Content Diets
To measure this on your own brand, use the AI search monitoring workflow and compare brand mentions, citations, and competitors separately for ChatGPT, Perplexity, Gemini, Claude, and Google AI Overview.
Frequently Asked Questions
Do AI engines surface different brands because they index different sources?
Partly. In this June 2026 sample, brand-mention decisions were fairly aligned across engines, but cited-domain overlap was much lower. That means source retrieval differs strongly, while the final brand decision can still converge.
Which engines behaved most similarly?
Gemini and Google AI Overview were the closest pair, with 98.4% brand-mention agreement and 0.643 average citation-domain Jaccard overlap across 62 queries answered by all five engines.
Which engines were most different?
ChatGPT and Claude were the weakest pair by both answer agreement and source overlap in this sample: 85.5% brand-mention agreement and 0.027 average citation-domain Jaccard overlap.
What should marketers do with this finding?
Track engines separately. A page or source that helps in Gemini and Google AI Overview may have little effect in ChatGPT, Claude, or Perplexity, even when all five engines answer the same buyer query with similar brand decisions.
What is citation-domain overlap?
Citation-domain overlap is the Jaccard similarity of two engines' cited root domains: the number of domains both engines cite, divided by the number of distinct domains either engine cites. A higher score means the engines are drawing from a similar source pool; a lower score means they can reach similar brand decisions while citing different sites.