Introduction

As large language models increasingly mediate how people discover and consume information, understanding their search and citation behavior becomes critical for content strategy. When ChatGPT answers questions, it doesn't rely on a single search; it performs a query fanout (a series of searches), evaluates its findings, and synthesizes sources into responses.

Unlike traditional search, where Search Engine Results Page (SERP) position alone determines visibility, the citations an LLM surfaces to users are determined by three key variables:

  1. Query fanout: the list of search queries ChatGPT executes on behalf of a user
  2. Query position: the order in which search queries occur within a given fanout
  3. SERP position: the Google SERP rank (#1-10) for a returned page

This study examines ChatGPT's search behavior across 420 prompts and 2,867 individual queries, comparing its citations against corresponding Google search results to understand what drives visibility in LLM-generated answers.

In this report, we attempt to show why traditional "rank higher" strategies are insufficient without understanding query position dynamics within LLM search systems.

1. Executive Summary

Our findings reveal a multiplicative effect wherein query position and SERP position work together to inform source relevance and importance across retrieved results. This shift changes how we think about visibility optimization strategy; for example, ranking #1 in the SERP for ChatGPT's first query yields a 40.2% domain citation rate. That same #1 position in the last query captures only 24.3%. Unlike in SEO, SERP position alone no longer predicts visibility; query position determines whether a top SERP ranking matters.

For publishers, marketers, and SEO professionals, this means optimization must account for two dimensions simultaneously: appearing in results for the queries ChatGPT generates first, and ranking highly on the SERP when you do appear.

Below, we briefly discuss some of the key relationships between these variables before a longer section on action-oriented takeaways for marketers.

  1. Query fanout shows declining per-query effectiveness as the number of queries goes up.
    1. As the number of search queries ChatGPT sends to a search index goes up, each individual query drives fewer citations (32.4% domain overlap per query for low fanout, dropping to 12.7% for high fanout).
  2. Query position strongly influences citations; sources retrieved at earlier query positions are prioritized in citations.
    1. The average first-query drives 34.4% domain overlap, nearly double the 17.6% rate of subsequent queries.
  3. SERP position loses influence over ChatGPT's citation criteria at later query positions.
    1. We use the term 'gradient compression' to characterize the narrowing gap in citation rates between top-ranked (#1) and bottom-ranked (#10) domains at later query positions, meaning agentic search shows decreasing sensitivity to SERP position.
Figure: The same SERP position delivers different outcomes depending on query position. SERP #1 yields a 40.2% citation rate on the first query but only 24.3% on the last query (a 1.7× gap); SERP #9-10 falls to 12.7% (first query) and 11.9% (last query).

Key Findings for Marketers

Track Citation Paths, Not Just Rankings

You can no longer afford to rest on the laurels of your SERP rank.

  • The interaction effect between query position and SERP position means you must optimize across two dimensions simultaneously: appearing in the SERP for the first query in the fanout, and ranking highly once you do.
SEO vs AEO: Amplified Position Effects

The first query in the fanout drives the most citations 67% of the time.

  • Being surfaced late in the query fanout while ranking low on the SERP creates compounding visibility penalties that don't exist in traditional search.
First Query Fanout Optimization is Non-Negotiable

Traditional SEO advises "rank for the right keywords." AEO adds query position specificity: rank for the first query in the fanout, which is the query users and LLMs would generate initially for a given prompt.

  • Example: The most cited query for project management software would be: "project management software" rather than "project management software for remote teams with time tracking".
Top-3 SERP Rank Drives First Query Citations

Every SERP position dropped on the first query costs 2-3× more citation rate than the same drop at a middle query position.

  • The 27.5pp gradient on first queries means they're especially position-sensitive. Ranking #3 on the first query (26.0%) beats ranking #1 on last query (24.3%).
  • Dropping from position 1 to position 6 on first queries creates a 23.5pp loss in citation-rate.
Cumulative Coverage Only Requires a Few Topic Clusters

In 67% of multi-query prompts, the first query serves as the primary citation source (contributing more citations than any other query position).

  • Cumulative domain overlap, the total domain overlap across all search queries for a given user prompt, reaches 45.6% by query 5.
  • However, 92% of the max observed cumulative domain overlap comes from queries 1-3 (43.9 of 47.6 percentage points), meaning it's especially productive to optimize SERP rank for the first 3 queries.
URL Precision Reflects Content Specificity

If you want a specific article to get picked up by LLMs, it's most useful to align the contents of the article with the relevant fanout's first query results.

  • The first query position enjoys an 18% URL overlap rate, which is 3x higher than the middle query URL overlap rate.

2. Query Position: The Foundation of Attribution

Understanding Query Position

When ChatGPT processes a complex prompt, it often performs a query fanout (multiple sequential searches against a search index) rather than answering from a single query. We find that query position has a substantial impact on the resulting citations:

First query: The initial search ChatGPT performs to understand the topic. Typically uses the most obvious, direct formulation.

  • Example: User asks "best project management software" → First query is "project management software"

Middle queries: These queries tend to be refinements and explorations as ChatGPT gathers more information. These queries have narrower focus and explore specific angles.

  • Example: "project management software for remote teams" or "asana vs monday comparison"

Last query: This query is the final search conducted before ChatGPT synthesizes its answer. It's also often the most specific or used to address remaining gaps.

  • Example: "project management software pricing 2025"

First-Query Position Dominates Citation-Share

Domain overlap peaks at the first query position (34.4%), then declines sharply to a low point at middle queries (17.6%). Last queries show a modest uptick to 21.5%, suggesting ChatGPT conducts final 'gap-filling' searches after synthesizing earlier results, looking for missing information to complete its response.

Query Position Overlap Rates
Overlap peaks at first query and generally declines through the sequence
Query Position  | Domain Overlap | Performance vs First Query
First queries   | 34.4%          | Baseline
Second queries  | 23.8%          | -10.6pp
Third queries   | 20.1%          | -14.3pp
Middle queries  | 17.6%          | -16.8pp
Last queries    | 21.5%          | -12.9pp

Primary Query Attribution

Among the 323 multi-query prompts analyzed:

Which Query Position Contributes Most Citations?

First queries dominate: 67.2% of prompts cite first-query sources most heavily

Query Position | Primary Attribution | Percentage
First queries  | 217 of 323 prompts  | 67.2%
Middle queries | 91 of 323 prompts   | 28.2%
Last queries   | 15 of 323 prompts   | 4.6%
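The attribution rule behind these numbers can be sketched as a small Python function. The citation map and its domains below are hypothetical stand-ins for the study's data, not its actual pipeline.

```python
from collections import Counter

def primary_query_position(citations):
    """Return the query position (1-indexed) that contributed the most
    citations to a prompt's final answer (the 'primary' position)."""
    counts = Counter(citations.values())
    # most_common(1) breaks ties arbitrarily; the study's tie rule isn't stated
    return counts.most_common(1)[0][0]

# Hypothetical prompt: five citations attributed across a three-query fanout
citations = {
    "a.com/x": 1, "b.com/y": 1, "c.com/z": 1,  # retrieved by the first query
    "d.com/w": 2,                               # second query
    "e.com/v": 3,                               # third query
}
print(primary_query_position(citations))  # → 1 (first query is primary)
```

Counting which prompts return 1 here, across all multi-query prompts, yields the 67.2% first-query share reported above.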

This 67% concentration reveals that two out of three times, the initial search contributes a plurality of the cited sources. Even when ChatGPT performs extensive research with 5-10 queries, the first query position still dominates final citation attribution.

Why this matters: You don't necessarily need to optimize for every possible query variation ChatGPT might generate. Win the first obvious, direct search and you capture the majority of citation opportunities.

Position Effects Across Fanout Levels

The position advantage remains remarkably consistent across different levels of query fanout. First query positions maintain a ~13pp edge whether a prompt triggers 2 queries or 10 in its fanout.

  • This universality suggests position effects are fundamental to LLM search architecture, rather than being driven by specific query types.

First-Query Advantage Across Fanout Levels

Position effects remain consistent (~13pp) regardless of total query count

Fanout Level         | First Query | Non-First Queries | Difference
Low (1-3 queries)    | 35.0%       | 22.2%             | +12.8pp
Medium (4-6 queries) | 38.1%       | 24.3%             | +13.8pp
High (7+ queries)    | 25.4%       | 11.8%             | +13.6pp

3. The Fanout Efficiency Curve

Fanout level measures the total number of queries ChatGPT performs for a prompt, revealing how search behavior changes with question complexity.

  • Low fanout (1-3 queries): indicates simple, straightforward questions.
  • Medium fanout (4-6 queries): suggests moderate complexity.
  • High fanout (7+ queries): represents more complex prompts requiring extensive research.

Fanout Efficiency Declines as Query Count Goes Up

As query count increases, individual query effectiveness (as measured by domain overlap rates) declines while cumulative coverage grows:

Average Per-Query Overlap by Fanout
Individual query performance decreases: 32.4% (low) → 26.2% (medium) → 12.7% (high fanout)
Fanout Level         | Per-Query Overlap | Decline from Low Fanout Level
Low (1-3 queries)    | 32.4%             | Baseline
Medium (4-6 queries) | 26.2%             | -19%
High (7+ queries)    | 12.7%             | -61%

We view two factors as responsible for driving this 61% decline at the high fanout level:

  1. Low-fanout prompts consist mostly of first/second queries (high-performing positions).
  2. High-fanout prompts tend to encourage increasingly divergent search spaces. More simply put, these prompts try different search approaches to form an educated opinion on a subject.

Cumulative Coverage: The Other Side of Fanout

While individual queries become less effective, pooling multiple queries increases cumulative domain overlap. Said differently, each later query contributes fewer citations on an individual level, but in aggregate they contribute up to ~25% of ChatGPT's information diet for a given response.

Cumulative coverage measures the total unique domains captured when pooling the first N queries together:
Queries Included | Cumulative Coverage | Marginal Gain
Query 1 alone    | 35.8%               | --
Queries 1-2      | 40.8%               | +5.0pp
Queries 1-3      | 43.9%               | +3.1pp
Queries 1-4      | 44.8%               | +0.9pp
Queries 1-5      | 45.6%               | +0.8pp
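Cumulative coverage as used here reduces to running set unions over each query's SERP domains. A minimal sketch with hypothetical domains (the real analysis works from full SERP and citation data):

```python
def cumulative_overlap(per_query_domains, cited_domains):
    """Cumulative domain overlap: fraction of ChatGPT-cited domains matched
    by the union of the first N queries' SERP domains, for each N."""
    seen, curve = set(), []
    for serp in per_query_domains:
        seen |= set(serp)                      # pool this query's domains in
        curve.append(len(cited_domains & seen) / len(cited_domains))
    return curve

# Hypothetical 3-query fanout: top SERP domains per query, and cited domains
serps = [{"a.com", "b.com"}, {"b.com", "c.com"}, {"d.com", "e.com"}]
cited = {"a.com", "c.com", "f.com", "g.com"}
print(cumulative_overlap(serps, cited))  # → [0.25, 0.5, 0.5]
```

The flat tail in this toy example mirrors the plateau in the table above: later queries add domains, but few of them are ones ChatGPT actually cites.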

Beyond three queries, marginal gains shrink below 1pp per additional query. The first three queries capture 92% of the maximum observed cumulative coverage (43.9 of 47.6 percentage points), suggesting an asymptotic overlap relationship. Domain overlap plateaus after five searches, and additional web search doesn't change that.


4. SERP Position × Query Position: The Interaction Effect

We studied how three variables determine citation probability:

  1. Query fanout (the set and number of search queries ChatGPT issues)
  2. Query position (the order in which queries are sent via ChatGPT)
  3. SERP position (rank 1-10 in Google search results)

This section examines how SERP position bias varies across different query positions. We find that SERP position effects are not constant across each position in the query fanout; instead, the SERP position's influence appears to decrease as query position goes up.

Understanding Gradients

Gradients are one lens for quantitatively understanding how the 'citation-rate spread' changes across a variety of SERP positions and query positions. A gradient measures the difference in citation rate between SERP position #1 and SERP position #10 at a given query position:

Steep gradient = SERP position matters intensely (large difference exists between top and bottom SERP position's citation rate).

Flat gradient = SERP position matters less (small difference, citations are distributed more evenly across SERP positions).
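As a concrete check, a gradient is simply the citation rate at SERP #1 minus the rate at SERP #10 for one query position. A minimal sketch using the first-query citation rates reported later in this section:

```python
def serp_gradient(citation_rate_by_serp_pos):
    """Gradient = citation rate at SERP #1 minus rate at SERP #10 for one
    query position. Steep -> rank-sensitive; flat -> rank matters less."""
    return citation_rate_by_serp_pos[1] - citation_rate_by_serp_pos[10]

# First-query citation rates (%) for SERP positions 1..10, from the study
first_query = dict(zip(range(1, 11),
                       [40.2, 33.1, 26.0, 20.4, 20.4, 16.7, 16.9, 14.2, 13.4, 12.7]))
print(round(serp_gradient(first_query), 1))  # → 27.5
```

Applying the same function to each query position's rates produces the 27.5pp → 12.4pp compression sequence discussed below.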

Gradient Compression Across Query Positions

We analyzed 2,867 query instances, measuring citation rates stratified by query position (first, second, third, middle, last) and SERP position (#1-10).

The interaction is statistically robust and reveals systematic gradient decay, which means that ChatGPT is less and less SERP position-sensitive as it moves past the first query position in the fanout.

One useful way of approaching this is viewing the gradient as part SEO and part AEO. This framing lets us identify parts of the gradient where low citation rates indicate that agent information preferences diverge from SEO expectations, signaling the influence of AEO.

We find that SEO-style optimization (keyword targeting, rank optimization) matters most for the first query position in the fanout because of the high citation rate for top-ranked SERP results. AEO (or agent preferences), on the other hand, kicks in for subsequent query positions, where the average citation rate is much lower. In the matrix below, we characterize 'SEO' as the top 10% of all citation-rate values, while the bottom 90% are 'AEO'.

Traditional vs Agent-Optimized Attribution
High-citation cells follow traditional ranking logic; the rest require optimization for agent search behavior.

Citation rate (%) by query position (rows) and SERP position 1-10 (columns):

Query Position |    1 |    2 |    3 |    4 |    5 |    6 |    7 |    8 |    9 |   10
First Query    | 40.2 | 33.1 | 26.0 | 20.4 | 20.4 | 16.7 | 16.9 | 14.2 | 13.4 | 12.7
Second Query   | 30.4 | 22.0 | 20.5 | 15.5 | 18.9 | 12.1 | 10.1 | 12.7 | 13.4 | 11.0
Third Query    | 26.5 | 20.6 | 17.4 | 15.3 | 12.2 |  9.8 | 11.5 |  9.4 | 11.0 |  7.8
Middle Queries | 13.9 | 14.6 |  9.5 |  9.5 | 10.3 |  8.1 |  7.5 |  8.1 |  5.0 |  4.7
Last Query     | 24.3 | 12.9 | 14.7 | 12.1 | 10.3 |  9.2 |  9.3 | 10.9 |  6.1 | 11.9

Traditional Ranking Logic: top-performing cells where Google SERP position predicts LLM citation.
Agent Search Behavior: cells requiring optimization for multi-query discoverability and synthesis patterns.

The Gradient Decay Pattern

The first-query gradient (27.5pp) is 2.2× larger than the last-query gradient (12.4pp), meaning SERP position matters most when the query position already favors you.

Gradient Compression Across Query Sequence
Position sensitivity shrinks as queries progress; SERP rank matters most on first queries.

Query Position | Gradient | Range (#1 → #10) | Compression vs First
First          | 27.5pp   | 40.2% → 12.7%    | Baseline (steepest)
Second         | 19.4pp   | 30.4% → 11.0%    | 29%
Third          | 18.7pp   | 26.5% → 7.8%     | 32%
Middle         | 9.2pp    | 13.9% → 4.7%     | 67%
Last           | 12.4pp   | 24.3% → 11.9%    | 55% (slight recovery)

To illustrate the difference between SEO methodology and ChatGPT retrieval methodology, the comparison below contrasts webpage click-through rates (CTR) by SERP position with ChatGPT citation rates by query position and SERP position.

SEO vs AEO: Position Effects
Traditional SEO is predictable; AEO is timing-dependent.

Traditional SEO (CTR by SERP position):

  • Position 1: 29%
  • Position 3: 11%
  • Position 5: 6%
  • Position 10: 2%

AEO / ChatGPT (citation rate by SERP and query position):

  • Position 1, First Query: 40.2%
  • Position 3, First Query: 26.0%
  • Position 1, Last Query: 24.3%
  • Position 5, Last Query: 10.3%

Traditional SEO optimizes a single dimension, SERP position, with predictable, consistent effects. In traditional search, position 1 always beats position 5 by roughly the same margin.

AEO requires optimizing a two-dimensional surface (rank higher + appear earlier in query sequences) with compounding, position-dependent effects. Counterintuitively, position 1 on query 5 can underperform position 3 on query 1.

The same SERP position delivers dramatically different citation outcomes depending on query position, which is a dynamic that doesn't exist in traditional search. SEO CTR ranges are averages taken by pooling the ranges found in AWR (2024), Sistrix (2020).
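The timing-dependence described above reduces to a two-key lookup: the same SERP rank maps to different citation rates at different query positions. A minimal sketch using four cells from this section's data:

```python
# Citation rates (%) indexed by (query position, SERP position),
# taken from this section's matrix; only the cells needed here are included.
RATES = {
    ("first", 1): 40.2, ("first", 3): 26.0,
    ("last", 1): 24.3, ("last", 5): 10.3,
}

# SERP #1 on the last query underperforms SERP #3 on the first query
assert RATES[("last", 1)] < RATES[("first", 3)]
print(round(RATES[("first", 3)] - RATES[("last", 1)], 1))  # → 1.7 (pp)
```

In a traditional CTR model, rank alone would index the table; here both keys are required, which is the two-dimensional optimization surface this report describes.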

5. Domain vs. URL Capture Rates

Domain overlap measures whether ChatGPT and Google consult the same publishers ("nytimes.com").

URL overlap measures whether they cite the same specific articles: nytimes.com/article1. The gap reveals how LLM search selects sources.
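The two metrics can be sketched as one function that compares citation and SERP URL lists at either granularity. The `base_domain` normalization (dropping scheme, path, and a leading `www.`) is a simplifying assumption since the study's exact normalization isn't specified, and the URLs are hypothetical.

```python
from urllib.parse import urlparse

def base_domain(url):
    """Simplified 'base domain': hostname with any leading 'www.' stripped."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host

def overlap(cited_urls, serp_urls, by="domain"):
    """Fraction of ChatGPT citations matched in the Google SERP, at either
    domain or exact-URL granularity."""
    if by == "domain":
        cited = {base_domain(u) for u in cited_urls}
        serp = {base_domain(u) for u in serp_urls}
    else:
        cited, serp = set(cited_urls), set(serp_urls)
    return len(cited & serp) / len(cited)

cited = ["https://www.nytimes.com/article1", "https://example.com/post"]
serp = ["https://nytimes.com/article2", "https://other.com/x"]
print(overlap(cited, serp, by="domain"))  # → 0.5 (publishers agree)
print(overlap(cited, serp, by="url"))     # → 0.0 (articles don't)
```

The gap between the two return values for the same inputs is exactly the "coverage gap" reported below.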

High-level summary statistics can be found below:

Metric              | Rate           | Interpretation
Mean domain overlap | 20.6%          | Publisher-level agreement
Mean URL overlap    | 8.3%           | Article-level agreement
Coverage gap        | 12.3pp         | Agreement on publishers but not specific articles
Zero URL overlap    | 37% of prompts | Completely different articles despite domain matches

Position Effects Show Strong Influence at the URL Level

Relative to middle queries, first queries achieve a 3× advantage in URL overlap versus a 2× advantage in domain overlap. This suggests ChatGPT's synthesis phase prioritizes articles found from the first query position in the fanout.

URL vs. Domain Overlap by Position

Position effects amplify at the URL level: first queries show a 3× advantage vs 2× for domains

Query Position | Domain Overlap | URL Overlap | URL Advantage vs Middle
First queries  | 34.4%          | 18.3%       | ~3×
Middle queries | 17.6%          | 6.2%        | Baseline
Last queries   | 21.2%          | 7.8%        | --

Cumulative URL Coverage

URL overlap gains from additional searches plateau faster than domain coverage (5.7pp total URL-overlap gain vs 9.8pp domain gain across five cumulative queries), indicating that later queries in the fanout explore novel, unrelated content rather than reinforcing specific articles retrieved earlier in the fanout.

Cumulative URL Coverage: First N Queries

Pooling multiple queries provides modest URL coverage gains. First query captures 16.2%, reaching 21.9% by five queries.

Queries Included | Cumulative URL Coverage | Marginal Gain
Query 1 alone    | 16.2%                   | --
Queries 1-2      | 19.5%                   | +3.3pp
Queries 1-3      | 20.8%                   | +1.3pp
Queries 1-5      | 21.9%                   | +1.1pp

6. Why It Matters

The shift from traditional search to AI-mediated discovery fundamentally changes how content gets found. Ranking #1 in Google for your recommended keywords no longer guarantees visibility if you're absent from the first query ChatGPT generates. The interaction between query position and SERP position creates a two-dimensional optimization challenge that traditional SEO tools aren't built to solve.

Brands that understand query position dynamics can capture 2-3× more citations than competitors and drive 25-50% more qualified lead growth through LLMs, among the most high-intent organic customer acquisition channels.

Companies that take AEO seriously will have the opportunity to fundamentally reshape their growth and marketing economics before competitors know how to observe the channel, much less act on it.

As LLM search continues to reshape discovery, content strategies must evolve beyond SERP position tracking to include query position analysis.

Profound's Answer Engine Insights helps teams get visibility into the relevant query fanouts to show up for, query positions to optimize for, and tactical insights to take control of your AEO visibility.

Reach out to our team to see how Profound can help you capture the first-query advantage.

7. Methodology & Data Specifications

Sample Composition

  • Total prompts: 420 (97 single-query, 323 multi-query)
  • Query positions analyzed: 2,867 across all prompts
  • Position analysis: Limited to 323 multi-query prompts (2,544 query positions) since single-query prompts lack position distinction.
  • Fanout strata:
    • Low (1-3 queries): 155 prompts (36.9%)
    • Medium (4-6 queries): 185 prompts (44.0%)
    • High (7+ queries): 80 prompts (19.0%)

Measurement Framework

Core metrics defined:

  • Domain overlap: Percentage of ChatGPT citations where the base domain appears anywhere in corresponding Google SERP top 10
  • URL overlap: Percentage where the specific article URL exactly matches
  • Query position: Ordinal sequence in ChatGPT's multi-query pattern
  • SERP position: Google rank (1-10) where matched citation appeared
  • SERP gradient: Citation rate difference between position #1 and #10 within a query position category
  • Per-Query Overlap: Average percentage of ChatGPT citations matched by each individual query's Google results. If a 5-query prompt shows overlaps of [15%, 20%, 18%, 22%, 25%], the per-query average is 20%.
  • Cumulative overlap: Union of unique domains/URLs across first N queries
  • Primary query attribution: Which single query position contributed most citations to final answer
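The per-query overlap definition can be checked directly against its own worked example:

```python
def per_query_overlap(overlaps):
    """Average of each individual query's overlap rate (%) for one prompt,
    matching the worked example in the metric definition above."""
    return sum(overlaps) / len(overlaps)

# The 5-query prompt from the definition: overlaps of [15%, 20%, 18%, 22%, 25%]
print(per_query_overlap([15, 20, 18, 22, 25]))  # → 20.0
```

Note the contrast with cumulative overlap: per-query overlap averages each query's match rate independently, while cumulative overlap takes the union of matched domains across the first N queries.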

Data Flow Example

From Prompt to Overlap Calculation:

  1. User Prompt: "What companies help enterprises build AI talent pipelines?"
  2. ChatGPT Query: "companies help enterprises build AI talent pipelines"
  3. Google SERP (Top 10): phenom.com, cio.com, gogloby.io, wharton.upenn.edu, hbr.org...
  4. ChatGPT Citations: microsoft.com, futurense.com, wikipedia.org, allegisgroup.com

Result: Domain Overlap: 0/4 = 0% | URL Overlap: 0/4 = 0%

Domain Convergence | URL Divergence Example

ICANIWILL Athletic Wear Query

Prompt: "Are ICANIWILL clothes good for high-intensity training?"

Queries triggered: 3

Google Results:

  • ✓ trustpilot.com
  • ✓ trustpilot.com
  • ✗ reddit.com

ChatGPT Citations:

  • ✓ trustpilot.com/dk
  • ✓ trustpilot.com/fi
  • ✓ trustpilot.com/de

Result: 100% domain overlap, 0% URL overlap

Both systems trust Trustpilot, but cite completely different review pages, demonstrating domain consensus without article alignment.

Limitations and Scope

Temporal: Data represents October 2025 behavior; LLM algorithms and SERP patterns evolve continuously and are subject to change.

Sample: 420 prompts span diverse topics but cannot capture all query types; gradient patterns may vary by: vertical, query type, or length.

Single LLM: Analysis examines ChatGPT-4 with search; other LLMs may exhibit different behaviors.

SERP depth: Analysis limited to top 10 Google results; it does not capture positions 11+.

Causality: Observational study identifies correlations; true causal mechanisms require controlled experimentation.