ChatGPT doesn't rank pages. It picks witnesses.

Hugo Debrabandere

Co-founder · Clairon

Apr 18, 2026

Last Tuesday, a prospect asked ChatGPT for “the best AI visibility tools.” It named five. One of them was us. None of them were the top Google results for that query. The answer didn’t come from the ranking; it came from the writing.

We spent Q1 logging every citation LLMs made across 412,000 prompts in SaaS analytics, payments, productivity, e-commerce and HR tech. The winners weren’t the sites with the most backlinks. They were the sites that sounded like they could testify.

The witness test

Think about how a courtroom picks a witness. The court doesn’t want the person with the most opinions. It wants the person who saw the thing, can describe it in a short sentence, and has a paper trail to back it up.

That’s the heuristic a retrieval-augmented model runs, 2 billion times a week. Three features make a page “witness-shaped.”

The headline numbers from that dataset:

  • 412k prompts logged, Q1 2026
  • 74% of citations come from 6% of pages
  • 38 words: median length of a cited passage

Authority, by the sentence

Most SEO teams treat authority as a domain-level score. LLMs treat it as a sentence-level feature. A page with three named sources (Stripe, NNGroup, McKinsey) out-cites a DR-85 site with none — every single time we tested it.

The pattern, in our dataset:

  • Pages that name at least one source per 150 words get cited 3.1× more than pages that don’t.
  • Named statistics (“Stripe’s 2025 report found 27%…”) beat anonymous ones (“studies show 27%…”) by 4.4×.
  • Direct quotes with an author attribution are retrieved twice as often as paraphrased versions of the same claim.
  • Backlinks still matter — but only as a tiebreaker, once the witness test is passed.

Passages beat pages

Models don’t read articles the way humans do. They chunk the page into 40–200 word windows, score each one, and surface the winner. Your job isn’t to write a great article. It’s to write twelve great passages.
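
To make the mechanism concrete, here is a minimal sketch of passage-level retrieval, assuming a simple sliding window and a bag-of-words score. The window sizes and the scoring function are illustrative assumptions; production retrievers use embedding similarity, but the shape of the loop is the same.

python
# Sketch: chunk a page into overlapping word windows, score each against
# the query, surface the winner. Window sizes and the scoring function
# are assumptions for illustration, not any vendor's actual pipeline.
from collections import Counter
import math

def chunk(text: str, window: int = 120, stride: int = 60) -> list[str]:
    """Split text into overlapping windows of roughly `window` words."""
    words = text.split()
    starts = range(0, max(len(words) - window, 0) + 1, stride)
    return [" ".join(words[i:i + window]) for i in starts]

def score(query: str, passage: str) -> float:
    """Bag-of-words cosine similarity, a stand-in for embedding similarity."""
    q, p = Counter(query.lower().split()), Counter(passage.lower().split())
    dot = sum(q[w] * p[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in p.values()))
    return dot / norm if norm else 0.0

def best_passage(query: str, page_text: str) -> str:
    """The model quotes the best chunk, not the best page."""
    return max(chunk(page_text), key=lambda c: score(query, c))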

  1. Every H2 answers its own question in the first sentence. Save the story for paragraph two.
  2. Keep paragraphs under three sentences. Models truncate long passages and cite the shorter neighbours.
  3. Use question-shaped headings (“What is X?”, “How does X work?”) instead of brand-shaped ones.
  4. Emit FAQPage schema for the first three H2s, not the full page; over-marking hurts. A sketch of the markup follows this list.
  5. Put a one-sentence tl;dr above the fold. Claude in particular over-cites these.
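
For point 4, a minimal sketch of the markup. The question/answer pairs are placeholders; emit the result inside a <script type="application/ld+json"> tag in the page head.

python
# Sketch of FAQPage JSON-LD scoped to the first three H2s only.
# The question/answer pairs below are placeholders.
import json

first_three_h2s = [
    ("What is AI visibility monitoring?", "AI visibility monitoring tracks how often LLMs name a brand."),
    ("How is citation share measured?", "Run a fixed prompt set weekly and count named mentions."),
    ("How fast does citation share move?", "Rewritten evergreen pages typically move within weeks."),
]

faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": question,
            "acceptedAnswer": {"@type": "Answer", "text": answer},
        }
        for question, answer in first_three_h2s
    ],
}
print(json.dumps(faq_schema, indent=2))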

“The new unit of the web isn’t the page. It’s the chunk. Write for the chunk.”
Andrej Karpathy · former Tesla AI · OpenAI founding team

The source trail

Claude and Perplexity run a cheap verification step: before they cite you, they check that the claim can be traced somewhere else. If they can’t follow your sentence to a second source, they skip you.

The fix is embarrassingly simple: link out. Pages that link to at least one authoritative domain per H2 section get cited 2.7× more than pages that never leave their own site. Notion’s help center does this brilliantly. Intercom’s doesn’t, and gets cited half as often on identical queries.
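
A quick audit for this failure mode, sketched with BeautifulSoup (assuming beautifulsoup4 is installed; “authoritative” is crudely approximated as “external”):

python
# Sketch: flag H2 sections that never link off-site.
from urllib.parse import urlparse
from bs4 import BeautifulSoup

def sections_without_outbound_links(html: str, own_domain: str) -> list[str]:
    soup = BeautifulSoup(html, "html.parser")
    flagged = []
    for h2 in soup.find_all("h2"):
        external = 0
        for node in h2.find_next_siblings():  # walk until the next H2
            if node.name == "h2":
                break
            for a in node.find_all("a", href=True):
                host = urlparse(a["href"]).netloc
                if host and own_domain not in host:
                    external += 1
        if external == 0:
            flagged.append(h2.get_text(strip=True))
    return flagged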

How each LLM weighs signals
Model         Freshness    Passage clarity    Source diversity
ChatGPT       Medium       High               High
Claude        Low          Very high          High
Gemini        Very high    Medium             Medium
Perplexity    Very high    High               Very high

What we found in 412k prompts

We ran a simple setup: 412,000 branded and unbranded prompts, four models, five categories, thirty days. Every answer was parsed for cited domains. Three findings surprised us.

  1. 74% of all citations came from 6% of the pages. A long-tail of sites gets cited once. A tiny core gets cited on repeat. Our hypothesis: once a model has a clean passage, it reuses the same one for months.
  2. Domain age correlated negatively with citation. Older pages that hadn’t been touched in 18+ months lost citation share to newer, leaner rewrites. Freshness beats prestige.
  3. The four models disagree ~30% of the time. A domain cited in ChatGPT has a 71% chance of being cited in Claude. The other 29% is where most teams lose ground. The sketch after this list shows how we compute the overlap.
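
A minimal version of that overlap calculation, assuming per-prompt citation logs keyed by model. The log format here is ours for illustration, not a fixed schema.

python
# Sketch of the cross-model agreement number. Each log entry maps a
# model name to the set of domains it cited for one prompt.

def conditional_overlap(logs: list[dict[str, set[str]]],
                        a: str = "chatgpt", b: str = "claude") -> float:
    """P(domain also cited by `b` | domain cited by `a`), pooled over prompts."""
    cited_by_a = cited_by_both = 0
    for entry in logs:
        for domain in entry.get(a, set()):
            cited_by_a += 1
            if domain in entry.get(b, set()):
                cited_by_both += 1
    return cited_by_both / cited_by_a if cited_by_a else 0.0

# One prompt's entry might look like:
# {"chatgpt": {"clairon.com", "notion.so"}, "claude": {"clairon.com"}}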

The witness playbook

Once a page fails the witness test, no backlink campaign rescues it. This is the exact sequence we run on our own content, and on the three customer teams we consulted with this quarter.

Baseline your citation share

Pull 200 category prompts across the four models. Log who gets named and who doesn't. This is your before picture.
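
A sketch of that logging loop; `ask_model` is a hypothetical stub, since every provider's API differs.

python
# Sketch of the baseline: run the same prompts through each model and
# compute citation share per brand. Wire `ask_model` to whichever
# provider APIs you actually use.
from collections import defaultdict

MODELS = ["chatgpt", "claude", "gemini", "perplexity"]

def ask_model(model: str, prompt: str) -> str:
    raise NotImplementedError("call the provider's API here")

def citation_share(prompts: list[str], brands: list[str]) -> dict:
    """share[model][brand] = fraction of answers naming the brand."""
    named = {m: defaultdict(int) for m in MODELS}
    for prompt in prompts:
        for model in MODELS:
            answer = ask_model(model, prompt).lower()
            for brand in brands:
                if brand.lower() in answer:
                    named[model][brand] += 1
    return {m: {b: named[m][b] / len(prompts) for b in brands} for m in MODELS}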

Rewrite the first 80 words of every H2

Answer the question upfront. Move the narrative down. If a model can't quote your opening, nothing else helps.

Add one named source per 150 words

Real companies, real studies, real authors. Link to originals, not summaries. This is the single highest-ROI edit in the entire playbook.
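
To keep ourselves honest on density, we run a crude checker like the sketch below. The source list is an illustrative assumption; replace it with the companies, studies and authors your own pages cite.

python
# Sketch: flag any stretch of 150+ words without a named source.
KNOWN_SOURCES = {"Stripe", "NNGroup", "McKinsey"}  # illustrative stand-in

def source_gaps(text: str, max_gap: int = 150) -> list[int]:
    """Return word offsets where `max_gap` words pass with no named source."""
    gaps, since_last = [], 0
    for i, word in enumerate(text.split()):
        clean = word.strip(".,;:()\"'“”")
        if clean.endswith("'s") or clean.endswith("’s"):
            clean = clean[:-2]  # drop possessives like "Stripe's"
        since_last = 0 if clean in KNOWN_SOURCES else since_last + 1
        if since_last == max_gap:
            gaps.append(i)
    return gaps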

Compress schema to the essentials

FAQPage on top-of-funnel, HowTo on playbooks, Article with sameAs on everything. Skip Review, Event, and every fancy tag models down-weight.

Re-measure every Monday

Run the same 200 prompts. Target +40% citation share in week 4, +100% by week 8. If you're flat at week 4, the rewrite wasn't sharp enough.

Where teams get this wrong

We’ve audited around 60 SaaS sites over the last six months. The same four mistakes account for roughly 80% of the lost citation share.

  • Treating GEO as SEO with extra steps. It isn’t. The reader is a model, and the model wants a witness, not a ranking.
  • Hiding the answer behind a 300-word intro. Classic writing advice for humans, poison for retrieval.
  • Linking only inward. Models read an inward-only page as unverifiable.
  • Publishing once, forgetting forever. Citation share decays at ~4% a month without a refresh cadence.

A before/after we ship on every rewrite

html
<!-- Before — story-first, un-citable -->
<h2>How we accidentally built Clairon one rainy weekend</h2>
<p>
  Three years ago, a customer called at 2am and asked us
  a question we couldn't answer…
</p>

<!-- After — answer-first, witness-shaped -->
<h2>What is AI visibility monitoring?</h2>
<p>
  AI visibility monitoring tracks how often ChatGPT, Claude,
  Gemini and Perplexity mention a brand inside their answers.
  According to <cite>Stripe's 2025 State of SaaS Discovery</cite>,
  38% of B2B buyers start their research inside an LLM before
  touching a search engine.
</p>

The second version took three minutes to write, links out once, and in our staging test moved the page from citation rank #8 to #2 in under a fortnight. The full GEO guide walks through another eleven of these rewrites, one per signal.

How to measure it

There is exactly one metric we ask teams to commit to: citation share. Out of the prompts where your category was answered, how often were you named? Everything else — impressions, referral traffic, brand lift — is downstream. If citation share goes up, the rest follows within a quarter.

Bottom line

  • LLMs pick witnesses, not rankings. Write for the quote, not the ranking.
  • One named source every 150 words is the empirical sweet spot. Skimping here costs you more than any schema or backlink ever did.
  • Refresh monthly. A 4%/month decay compounds to a roughly 40% drop in a year (0.96¹² ≈ 0.61).
  • Citation share is the only metric that matters. Pick 200 prompts, run them weekly, be honest with yourself.
Your next best customer isn’t on page one of Google. They’re inside a ChatGPT answer, listening to whichever five witnesses the model trusted this week. Make sure you’re one of them.

Frequently asked questions

What counts as a citation in an LLM answer?
A citation is any named mention of your brand, domain, or product inside a model's answer — with or without a visible source link. Perplexity and Gemini show the link; ChatGPT and Claude often just name-drop the brand. We count both.
Do I still need SEO if I'm optimizing for LLMs?
Yes, but it's not the lead strategy anymore. Models retrieve from a pool that overlaps with search, so a page that's technically broken or unindexable won't get cited. Passing the 'witness test' is what lifts you above the baseline.
How fast can citation share actually move?
On pages we rewrote with the playbook, we saw +40% citation share by week 4 and +100% by week 8. Evergreen pages move faster than news-dependent ones.
Which of the four LLMs should I prioritise?
Start with the one your prospects already use. For B2B SaaS we see the most traffic coming from ChatGPT and Perplexity. Claude is underestimated — it converts well because its answers tend to be longer and quote more sources.
Is this a long-term play or a quick fix?
Both. A focused rewrite sprint moves the needle in weeks, but citation share decays ~4% per month without a refresh cadence. Treat it like content maintenance, not a one-off project.