AI Engines Agree on Brands More Than Sources

A June 2026 study of 1,373 AI answers found that five AI engines usually agree on whether a brand belongs in the answer, yet their cited source domains diverge sharply. Gemini and Google AI Overview are the exception, with 98.4% brand-mention agreement and 0.643 citation-domain overlap.

Last updated: 2026-06-23
Data window: 2026-06-16 to 2026-06-20 UTC (1,373 AI answers; 533 latest query-by-engine answers; 62 queries answered by all five engines)

Methodology

We pulled anonymized aggregate Foglift monitoring data from 2026-06-16 through 2026-06-20 UTC. The analysis keeps the latest answer per workspace, query, and engine, then measures the 62 workspace-query groups where all five engines produced an answer. For each pair of engines, brand-mention agreement is the share of those groups in which both engines made the same yes/no call on whether the brand belonged in the answer. Citation-domain overlap is the Jaccard similarity over normalized cited root domains: in plain terms, how much two engines drew from the same source sites. No workspace, customer, query, answer text, or individual source URL is published.
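The two filtering steps described above can be sketched roughly as follows. This is a minimal illustration, not the study's aggregation script: the record fields (`workspace`, `query`, `engine`, `ts`) and the engine labels are hypothetical, and timestamps are assumed to be ISO 8601 UTC strings in one consistent format so that string comparison orders them correctly.

```python
from collections import defaultdict

# Hypothetical engine labels; the real data may name engines differently.
ENGINES = {"chatgpt", "claude", "gemini", "google_ai_overview", "perplexity"}


def latest_per_key(answers):
    """Keep the most recent answer for each (workspace, query, engine)."""
    latest = {}
    for answer in answers:
        key = (answer["workspace"], answer["query"], answer["engine"])
        # Assumes 'ts' values are same-format ISO 8601 strings.
        if key not in latest or answer["ts"] > latest[key]["ts"]:
            latest[key] = answer
    return latest


def complete_groups(latest):
    """Return only the workspace-query groups answered by all five engines."""
    groups = defaultdict(dict)
    for (workspace, query, engine), answer in latest.items():
        groups[(workspace, query)][engine] = answer
    return {key: by_engine for key, by_engine in groups.items()
            if set(by_engine) == ENGINES}
```

Under these assumptions, the first function would correspond to the 533 latest query-by-engine answers and the second to the 62 complete groups.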

The finding

If you optimize for AI visibility, one engine mentioning your brand is not enough. The engines may agree that your brand belongs in an answer, but they often justify that answer with different source sites. That means source coverage has to be earned and measured engine by engine.

Both layers matter. The engines often converge on the brand decision, while the cited source layer stays fragmented. In the 62 queries where ChatGPT, Claude, Gemini, Google AI Overview, and Perplexity all produced an answer, nine of the ten engine pairs agreed on the yes/no brand-mention decision more than 90% of the time. Only ChatGPT and Claude fell below that mark, at 85.5%.

Their source sets told a different story. Gemini and Google AI Overview shared a large source layer, with 0.643 average citation-domain Jaccard overlap. Every other pair was much lower. Gemini and Perplexity reached only 0.197. Google AI Overview and Perplexity reached 0.188. ChatGPT and Claude reached 0.027.

AI answers analyzed: 1,373
Five-engine queries: 62
Closest source pair: 0.643 (Gemini ↔ Google AI Overview)
Lowest source pair: 0.027 (ChatGPT ↔ Claude)

Engine-level source density

The five engines had similar brand-mention rates in the measurement window, ranging from 52.7% to 56.5%. The larger difference was how much source material each answer exposed. Gemini, Google AI Overview, and Perplexity averaged roughly 10 to 11 citations per answer. ChatGPT averaged 3.28. Claude averaged 1.34 and exposed citations in 41.2% of answers.

Engine	Answers	Mention rate	Avg citations	Citation coverage
ChatGPT	246	56.5%	3.28	96.3%
Claude	245	54.7%	1.34	41.2%
Gemini	244	56.1%	11.15	100.0%
Google AI Overview	406	52.7%	10.43	99.5%
Perplexity	232	53.9%	11.32	100.0%
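The three per-engine columns could be computed along these lines. This is a sketch under stated assumptions: the field names `brand_mentioned` and `citations` are hypothetical, and it assumes the citation average is taken over all answers, including those with no citations, which the article does not spell out.

```python
def engine_summary(answers):
    """Summarize one engine's answers.

    Each answer is a dict with hypothetical fields:
      'brand_mentioned': bool
      'citations': list of cited sources (may be empty)
    """
    n = len(answers)
    if n == 0:
        raise ValueError("no answers to summarize")
    mentions = sum(1 for a in answers if a["brand_mentioned"])
    total_citations = sum(len(a["citations"]) for a in answers)
    with_citations = sum(1 for a in answers if a["citations"])
    return {
        "answers": n,
        "mention_rate": mentions / n,
        # Assumption: averaged over all answers, not only cited ones.
        "avg_citations": total_citations / n,
        # Share of answers exposing at least one citation.
        "citation_coverage": with_citations / n,
    }
```

If the average were instead taken only over answers that expose citations, an engine with low coverage such as Claude would show a higher figure, so the denominator is worth confirming against the CSV.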

Pairwise agreement vs. source overlap

The strongest split appears when the answer layer and source layer are measured side by side. Gemini and Google AI Overview look like siblings: high brand-mention agreement and high source overlap. Perplexity agrees with the Google engines on the brand decision but cites a different source pool. ChatGPT and Claude converge less often on the brand decision and share almost no cited domains.

Brand-mention agreement tells you whether two engines made the same call on inclusion. Citation-domain overlap tells you whether they cited the same sites. A high agreement score with a low overlap score is the important pattern: the engines can choose the same brand for different evidentiary reasons.
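A minimal sketch of how the two pairwise metrics might be computed over the complete query groups. The data layout and the field names `mentioned` and `domains` are hypothetical. How the study treats answers with no citations is not stated, so the choice here (skip a query when both engines cite nothing, score 0 when only one does) is an assumption.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two domain sets, or None if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return None  # Assumption: undefined when neither engine cites anything.
    return len(a & b) / len(union)


def pairwise_metrics(groups):
    """groups: {query_key: {engine: {'mentioned': bool, 'domains': set}}}.

    Every group is assumed to contain the same engines.
    """
    engines = sorted(next(iter(groups.values())))
    results = {}
    for e1, e2 in combinations(engines, 2):
        agree = 0
        overlaps = []
        for by_engine in groups.values():
            if by_engine[e1]["mentioned"] == by_engine[e2]["mentioned"]:
                agree += 1
            score = jaccard(by_engine[e1]["domains"], by_engine[e2]["domains"])
            if score is not None:
                overlaps.append(score)
        results[(e1, e2)] = {
            "agreement": agree / len(groups),
            "avg_jaccard": sum(overlaps) / len(overlaps) if overlaps else None,
        }
    return results
```

Note that agreement counts matching "no" calls as well as matching "yes" calls, so two engines that both omit the brand still agree.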

Engine pair	Brand-mention agreement	Avg citation-domain Jaccard
Gemini ↔ Google AI Overview	98.4%	0.643
Gemini ↔ Perplexity	96.8%	0.197
Google AI Overview ↔ Perplexity	95.2%	0.188
ChatGPT ↔ Gemini	93.5%	0.084
Claude ↔ Gemini	91.9%	0.062
Claude ↔ Perplexity	91.9%	0.042
ChatGPT ↔ Google AI Overview	91.9%	0.082
ChatGPT ↔ Perplexity	90.3%	0.054
Claude ↔ Google AI Overview	90.3%	0.062
ChatGPT ↔ Claude	85.5%	0.027
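Because every agreement figure is measured over the same 62 query groups, each percentage corresponds to a whole number of agreeing queries. The back-of-envelope sketch below recovers those counts from the published percentages. It assumes each figure is rounded to one decimal and uses all 62 groups as the denominator. The helper name is illustrative.

```python
N_GROUPS = 62

# Brand-mention agreement percentages, in table order.
REPORTED = [98.4, 96.8, 95.2, 93.5, 91.9, 91.9, 91.9, 90.3, 90.3, 85.5]


def implied_count(pct, n=N_GROUPS):
    """Return the count k (0..n) for which k/n rounds to pct, else None."""
    for k in range(n + 1):
        if round(100 * k / n, 1) == pct:
            return k
    return None


counts = [implied_count(p) for p in REPORTED]
disagreements = [N_GROUPS - k for k in counts]
print(counts)
print(disagreements)
```

On that reading, the implied agreeing counts are 61, 60, 59, 58, 57, 57, 57, 56, 56, and 53 out of 62. Gemini and Google AI Overview disagreed on a single query and ChatGPT and Claude on nine, and one query changing sides moves any of these figures by about 1.6 percentage points.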

Interpretation

This points to a two-step model of AI search visibility. First, the engine builds a candidate set from its retrieved or internally available evidence. Second, it synthesizes a brand answer from that evidence. The candidate-source layer varies sharply by engine. The synthesis layer can still land on the same brand decision.

The Google pair is the clearest evidence for a shared retrieval substrate. Gemini and Google AI Overview agree on the brand decision in 98.4% of complete query groups and share far more cited domains than any other pair. That does not mean the answers are identical. It means the source universe feeding the answers is visibly related.

Perplexity shows the opposite pattern. It often agrees with Gemini and Google AI Overview on whether the brand belongs in the answer (96.8% and 95.2%), yet its citation overlap with them is much lower (0.197 and 0.188). In other words, retrieval differs while the answers converge.

What to do with it

A blended AI Visibility score is useful for the board-level trend. It is too blunt for source acquisition. Source strategy has to be engine-specific.

For Gemini and Google AI Overview, treat improvements as partially shared. Content that becomes a strong source for one has a realistic chance of helping the other.
For Perplexity, audit the citation panel directly. The brand answer can match the Google engines while the source set comes from a different publisher universe.
For ChatGPT and Claude, track both mention status and cited-source density. A brand can be named with few surfaced sources, which makes source attribution a separate optimization problem.

The practical takeaway is simple: optimize the answer and the source layer separately. The answer layer tells you whether the brand is being selected. The source layer tells you which publisher ecosystem is doing the work.

Limits

This is production monitoring data, not a controlled query benchmark. Queries span multiple workspaces and industries, and the measurement window is short: 2026-06-16 through 2026-06-20 UTC. We publish only anonymized aggregates. The 62 complete query groups are enough to show the source-vs-answer split, but the analysis should be refreshed quarterly with a controlled query set before being used as a category-wide law.

Citation-domain Jaccard measures surfaced citations only. It does not claim to observe every page an engine retrieved internally. That makes the metric conservative: it measures the source layer available to users and downstream citation analysis.

Reproducibility

The aggregation script is saved under state/research/engine-source-divergence-2026-aggregation.mjs. It reads production monitoring answers, keeps only anonymized aggregate metrics, and emits the same tables used here. The downloadable CSV contains the measurement window, engine summary, and pairwise overlap table.

Earlier controlled-query reference benchmark: AI Search Citation Benchmark, Q2 2026
Companion source-overlap analysis: ChatGPT vs. Google AI Overview
Engine content-type breakdown: Five AI Engines, Five Content Diets

To measure this on your own brand, use the AI search monitoring workflow and compare brand mentions, citations, and competitors separately for ChatGPT, Perplexity, Gemini, Claude, and Google AI Overview.

Frequently Asked Questions

Do AI engines surface different brands because they index different sources?

Partly. In this June 2026 sample, brand-mention decisions were fairly aligned across engines, but cited-domain overlap was much lower. That means source retrieval differs strongly, while the final brand decision can still converge.

Which engines behaved most similarly?

Gemini and Google AI Overview were the closest pair, with 98.4% brand-mention agreement and 0.643 average citation-domain Jaccard overlap across 62 queries answered by all five engines.

Which engines were most different?

ChatGPT and Claude were the weakest pair by both answer agreement and source overlap in this sample: 85.5% brand-mention agreement and 0.027 average citation-domain Jaccard overlap.

What should marketers do with this finding?

Track engines separately. A page or source that helps in Gemini and Google AI Overview may have little effect in ChatGPT, Claude, or Perplexity, even when all five engines answer the same buyer query with similar brand decisions.

What is citation-domain overlap?

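Citation-domain overlap is the Jaccard similarity between the sets of root domains two engines cite for the same query: the number of domains both engines cite, divided by the number of domains either engine cites. It ranges from 0 (no shared domains) to 1 (identical domain sets). A higher score means the engines are drawing from a similar source pool; a lower score means they can reach similar brand decisions while citing different sites.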