A VP Marketing at a $25M ARR analytics SaaS pinged me last Thursday: "Our competitor is showing up everywhere in ChatGPT, we are nowhere." First question I asked: how do you know? She did not. She had run two prompts in her browser on a Tuesday morning, seen Looker named twice, and forwarded the screenshots to her CEO. Two prompts are a sample of two against a system whose answers change 70% of the time between runs. The screenshots were not data; they were anxiety with a logo on top.
Most teams skip the baseline and go straight to the tool purchase. Wrong order. The 30-minute manual baseline below tells you whether your citation gap is real or imagined, which engines actually matter for your category, and where the competitor is earning citations. Once that is clear, you can buy a tool with a specific job description.
Why competitor analysis in AI search is not SEO competitor analysis
Most SEO competitor frameworks shipped between 2010 and 2024 assume one thing: that the engines are reading roughly the same web. In 2026, they are not. Each engine pulls from a different retrieval pool with a different refresh cadence and a different weighting on third-party authority versus owned content.
| Engine | Refresh cadence | Google top-10 overlap |
|---|---|---|
| ChatGPT | 12 to 16 weeks | 6.5% (BrightEdge, 2026) |
| Claude | Quarterly+ (slowest) | Limited public data |
| Perplexity | 4 to 6 weeks (fastest) | 43.5% (Ahrefs, 2025) |
| Gemini | 8 to 12 weeks | Mid-range |
| Copilot | Inherits Bing, ~weekly | Bing-correlated |
| Google AI Overviews | 8 to 10 weeks | ~17% (citations from outside top 10 ~83%) |
upGrowth’s 2026 framework quantifies the consequence: 83% of Google AI Overviews citations come from pages outside the top 10 organic results. Translation: a competitor who outranks you on Google can still be invisible on AI Overviews, and a competitor invisible on Google can dominate a Perplexity answer. SEO position alone tells you almost nothing about citation share.
Pick 3 competitors and 3 tier buckets
Three real competitors. No more. Beyond three, the prompt grid becomes unworkable inside the 30-minute window, and the tier matrix loses its forcing function. Pick competitors that represent three structural threats: the incumbent leader, the fastest-growing peer, and the AI-native challenger.
Three tier buckets to organize the prompts:
- Tier A: named head-to-head. "HubSpot vs Salesforce." Forces the model to rank. Highest commercial signal, easiest to score.
- Tier B: category recommendation. "Best CRM for a 50-person B2B SaaS." Your name may not appear in the prompt. Tests whether you make the shortlist at all.
- Tier C: problem-led. "How do I stop losing deals at the proposal stage?" Long-tail. Often where small brands displace incumbents because the model has weaker priors.
Reference matrix (use these to sanity-check the method)
Plug your own competitors in, but verify the methodology in real time using one of these publicly observable head-to-heads first. Run any prompt in the next section and you should see results consistent with what we describe.
| Vertical | Brand 1 | Brand 2 | Brand 3 |
|---|---|---|---|
| Marketing / CRM | HubSpot | Salesforce | Pipedrive |
| Productivity | Notion | Linear | ClickUp |
| Payments | Stripe | Adyen | Checkout.com |
| Observability | Datadog | New Relic | Grafana |
| Analytics SaaS | Looker | Tableau | Mode |
The 30-prompt grid (verbatim)
Three tiers × ten prompts each. The phrasings below are the shapes that consistently force comparable answers across all six engines. Replace the bracketed placeholders with your actual competitors and category.
Tier A: direct head-to-head (10 prompts)
Forces the engines to rank. Use these verbatim:
1. Compare [Brand 1] vs [Brand 2] vs [Brand 3] for a [team size] [vertical]. Rank them in order and give one paragraph per tool with sources.
2. [Brand 1] vs [Brand 2] vs [Brand 3] for [primary use case]. Which would you recommend and why? Cite specific reviews.
3. Score [Brand 1], [Brand 2] and [Brand 3] on [criterion 1], [criterion 2] and [criterion 3]. Use a 1-to-10 scale.
4. For a [team size] team running [workflow], compare the top 3 [category] tools head-to-head.
5. Which is better, [Brand 1] or [Brand 2], for [specific workflow]? Be specific.
6. List the top 3 [category] tools in 2026 and rank them by [your differentiator].
7. [Brand 1] vs [Brand 2]: which one has better [specific feature]?
8. Compare pricing and total cost of ownership for [Brand 1], [Brand 2] and [Brand 3] over 24 months for a [team size] team.
9. Which [category] tool is the best fit for [vertical], [Brand 1], [Brand 2] or [Brand 3]?
10. Rank [Brand 1], [Brand 2] and [Brand 3] for [specific integration] support.
Tier B: category recommendation (10 prompts)
Your name may not appear. Tests whether you make the shortlist.
1. What are the top 3 [category] tools for a [team size] [vertical] in 2026? List with one-line reasons and sources.
2. Best [category] for a remote-first startup, 2026. Top 3 only.
3. Best [category] tool with [specific capability], 2026.
4. Recommend a [category] platform for a $[revenue] [vertical].
5. What [category] tools have the highest customer satisfaction in 2026?
6. Best mid-market [category] tools under $[price] per month.
7. Top 5 [category] platforms for [specific use case].
8. Which [category] tools have the best [feature] in 2026?
9. Affordable [category] alternatives to [market leader].
10. Best [category] tool for [specific integration].
Tier C: problem-led (10 prompts)
Long-tail, where small brands displace incumbents:
1. Our [team] keeps [problem]. What software stack would you recommend? Name specific products.
2. How do I [specific job]? Recommend specific tools.
3. We are [team size] and our [workflow] is a mess across [N] tools. What is the right consolidation?
4. What is the best way to [specific outcome] for a [vertical]?
5. How do I improve [metric] using [category] software?
6. What tool helps with [specific pain point]?
7. Recommend a workflow to [specific goal] using [category].
8. Best practices to [outcome], with named tools.
9. I am switching from [legacy tool], what is the modern replacement?
10. How do I scale [process] from 10 to 100 [units]? Name the stack.
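Before running anything, expand the placeholders once so every engine and every run sees byte-identical phrasing. A minimal sketch of that expansion; the fill values and the two sample templates are illustrative, and the remaining 28 templates come from the grids above:

```python
# Expand bracketed placeholders into concrete prompts once, so every
# engine and every run sees identical phrasing.
FILLS = {  # illustrative values; substitute your own competitors and category
    "[Brand 1]": "Looker", "[Brand 2]": "Tableau", "[Brand 3]": "Mode",
    "[category]": "analytics", "[team size]": "50-person",
    "[vertical]": "B2B SaaS",
}

TEMPLATES = {
    "A-01": "Compare [Brand 1] vs [Brand 2] vs [Brand 3] for a [team size] "
            "[vertical]. Rank them in order and give one paragraph per tool "
            "with sources.",
    "B-01": "What are the top 3 [category] tools for a [team size] [vertical] "
            "in 2026? List with one-line reasons and sources.",
    # ...add the remaining 28 templates from the three tiers above
}

def expand(template: str) -> str:
    for placeholder, value in FILLS.items():
        template = template.replace(placeholder, value)
    return template

prompts = {pid: expand(t) for pid, t in TEMPLATES.items()}
```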
Run it across 6 engines, two runs each
Open one fresh incognito window per engine. No logged-in personalization. Run each prompt twice, ideally spaced across 48 hours, because LLMs are stochastic and a single run is one observation of a system with 70% answer churn.
1. Open six incognito tabs, one per engine.
2. Paste each prompt verbatim, twice per engine.
3. For each answer, log the scorecard fields below.
4. Repeat for all 30 prompts × 6 engines × 2 runs (the sketch below generates the full checklist).
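The full grid is 360 cells and is easy to mis-track by hand. A minimal sketch that generates the run checklist, assuming the prompt IDs follow the A-01 through C-10 convention used in the scorecard below:

```python
from itertools import product

ENGINES = ["ChatGPT", "Claude", "Perplexity", "Gemini", "Copilot",
           "Google AI Overviews"]
PROMPT_IDS = [f"{tier}-{n:02d}" for tier in "ABC" for n in range(1, 11)]
RUNS = [1, 2]

cells = list(product(PROMPT_IDS, ENGINES, RUNS))
assert len(cells) == 360  # 30 prompts x 6 engines x 2 runs

for prompt_id, engine, run in cells:
    print(f"{prompt_id}\t{engine}\trun {run}")  # paste into the sheet as rows
```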
The competitor citation share scorecard
Copy this template into a Google Sheet. One row per cell (prompt × engine × run). At 360 rows the patterns become legible.
| Field | Example value | Why it matters |
|---|---|---|
| Date run | 2026-04-28 | LLMs drift. Date-stamp every cell. |
| Engine | ChatGPT-5 | Cross-engine SOV requires per-engine attribution. |
| Prompt ID | A-01 (Tier A, prompt 1) | Lets you re-run identical prompts on a quarterly cadence. |
| Tier | A / B / C | Tier C is where SMBs steal citations from incumbents. |
| Run | 1 of 2 | Stochastic, only 30% inter-run consistency (AirOps). |
| Your brand cited? | Y / N | Binary base. |
| Position | 1st named / mid / footnote | Position 1 gets 1.5 to 2× more consideration than position 3. |
| Sentiment | Positive / Neutral / Cautionary | Cautionary mentions hurt, separate from wins. |
| Cited domain (your brand) | g2.com, yourbrand.com, reddit.com/r/SaaS | Tells you which third party is doing the lifting. |
| Comp 1 cited? | Y / N | Same fields per competitor. |
| Comp 1 position + cited domain | 1st named / hubspot.com/blog/x | The third-party asset you need to displace. |
| Citation gap (pp) | (their SOM) − (your SOM) per engine | The weekly KPI. |
| Recovery hypothesis | Add 120-word answer block to /pricing | Tie every gap row to one editable paragraph. |
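If you prefer logging rows in code rather than a sheet, a minimal row type mirroring the template; the field names are my shorthand, not a required schema:

```python
from dataclasses import dataclass

@dataclass
class ScorecardRow:
    date_run: str             # "2026-04-28"; LLMs drift, date-stamp every cell
    engine: str               # "ChatGPT", "Claude", ...
    prompt_id: str            # "A-01"
    tier: str                 # "A" / "B" / "C"
    run: int                  # 1 or 2
    brand_cited: bool
    position: str             # "1st named" / "mid" / "footnote"
    sentiment: str            # "Positive" / "Neutral" / "Cautionary"
    cited_domain: str         # the third party doing the lifting
    comp1_cited: bool
    comp1_position: str       # plus the competitor's cited domain
    recovery_hypothesis: str  # one editable paragraph per gap row
```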
Compute share-of-model and the gap
Two derived numbers close the loop. Compute both per engine, then average across engines for a cross-engine SOM.
```python
# Per engine, per brand
def som_engine(cells_with_brand: int, total_cells: int) -> float:
    """Share-of-model: % of answer cells on one engine that name the brand."""
    return cells_with_brand / total_cells * 100

# Cross-engine
def som_cross(som_by_engine: list[float]) -> float:
    return sum(som_by_engine) / len(som_by_engine)

# Citation gap to leader, in percentage points (illustrative inputs)
som_leader_competitor, som_yours, engines_with_gap = 27.0, 12.0, 5
gap_pp = som_leader_competitor - som_yours

# Action threshold
if gap_pp >= 10 and engines_with_gap >= 4:
    action = "automate the loop, buy a cross-engine tool"
elif gap_pp >= 10 and engines_with_gap < 4:
    action = "engine-specific content sprint"
else:  # gap_pp < 10
    action = "ship 2-3 paragraph rewrites, re-baseline in 4 weeks"
```

B2B SaaS leadership threshold is roughly 25%+ cross-engine SOM (upGrowth, 2026). Most B2B SaaS sit at 15 to 20%. If you are below 10% cross-engine and your top competitor is above 25%, you have an addressable but real gap. The recovery target is the 10-point band; the recovery horizon is two quarters.
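To compute the same numbers straight from the 360-row sheet, a minimal pandas sketch; the filename and column names are my shorthand from the row type above, not a required schema:

```python
import pandas as pd

df = pd.read_csv("baseline.csv")  # one row per prompt x engine x run cell

# The sheet stores Y / N; convert to booleans first
for col in ["brand_cited", "comp1_cited"]:
    df[col] = df[col].eq("Y")

# SOM per engine: share of cells where each brand was cited, as a percentage
som_by_engine = df.groupby("engine")["brand_cited"].mean() * 100
comp_by_engine = df.groupby("engine")["comp1_cited"].mean() * 100
som_cross = som_by_engine.mean()

# Per-engine gap in points, then count engines with a 10+ point gap
gap_by_engine = comp_by_engine - som_by_engine
engines_with_gap = int((gap_by_engine >= 10).sum())
```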
Tag the citation source for every win
The source domain column is the most actionable column in the sheet. Cluster wins by source type:
- Owned domain wins (yourbrand.com cited): your content engine is doing the work. Double down on the page shapes that win.
- G2 / Capterra wins: review platform momentum. If competitors win here and you do not, you have a review volume problem.
- Reddit / Quora wins: watch the date. Post-September 2025, ChatGPT deweighted Reddit aggressively. Reddit wins observed pre-Q4 2025 may already be decaying.
- LinkedIn / Forbes / news wins: rising. The slice that absorbed the redistributed Reddit / Wikipedia citation share. Invest here in 2026.
- Comparison-blog wins (third-party listicles): invest in placement on the listicles your competitor wins on. Eighty-five percent of brand mentions come from third-party pages (AirOps); this is the cheapest displacement vector. A classification sketch follows this list.
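Once the sheet exists, the clustering can be scripted. A minimal sketch; the bucket names mirror the list above, and the domain lists and `yourbrand.com` default are illustrative:

```python
SOURCE_BUCKETS = {
    "review_platform": ("g2.com", "capterra.com"),
    "community": ("reddit.com", "quora.com"),      # watch the date on these
    "news_social": ("linkedin.com", "forbes.com"),
}

def classify(domain: str, own_domain: str = "yourbrand.com") -> str:
    """Map a cited domain to one of the source-type buckets above."""
    if own_domain in domain:
        return "owned"
    for bucket, domains in SOURCE_BUCKETS.items():
        if any(d in domain for d in domains):
            return bucket
    return "comparison_blog"  # everything else: third-party listicles
```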
Engine-specific gotchas
Each engine has a personality. Skipping these calibrations is how you ship a "competitor X dominates" report that is actually an artifact of your prompt phrasing.
| Engine | Watch out for | Calibration |
|---|---|---|
| ChatGPT | Source mix shifted hard in Sept 2025 (Reddit -50pp, Wikipedia -35pp) | Re-baseline if your data predates Oct 2025 |
| Claude | Refuses to rank named competitors ~40% of the time without scaffolding | Prepend a procurement-analyst persona prompt |
| Perplexity | Refresh cadence is 4 to 6 weeks, fastest of the six | Re-baseline monthly, weekly if you ship a content sprint |
| Gemini | Under-cites by default, favors LinkedIn | Add: 'Cite at least 3 external sources per recommendation' |
| Copilot | Inherits Bing's source bias, results correlate with Bing rankings | If you rank well on Bing, expect Copilot wins |
| Google AI Overviews | 83% of citations from outside Google top 10 | SEO position is uncorrelated with AIO citation; do not assume overlap |
A competitor that wins on one engine is winning a contest. A competitor that wins on five is shipping a strategy.
When to upgrade from manual to a tool
The manual baseline is honest, defensible, and free. It is also a one-time exercise. Once you have proven the gap is real, the weekly maintenance becomes the bottleneck. The break-even is consistent across the customer set we have audited:
- Sustained 10+ point gap across 4 or more engines. You need automated cross-engine tracking, the manual loop cannot keep up week-to-week.
- 10+ hours per week on the manual loop. At $80/hr loaded analyst cost, the manual loop costs $3,200/mo in labor. A $49 to $499 monthly tool subscription is roughly 6 to 65× cheaper (back-of-envelope after this list).
- Need for paragraph-level rewrite suggestions. Once you know which prompts you lose, the next question is "rewrite which sentence on which page." Manual analysis cannot tell you that, but tools that pair measurement with operative recommendations can.
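The back-of-envelope behind the second bullet, using the figures quoted there:

```python
manual_cost = 10 * 80 * 4       # 10 hr/wk x $80/hr x ~4 wk = $3,200/mo
tool_low, tool_high = 49, 499   # monthly tool subscription range

print(manual_cost / tool_high)  # ~6.4x cheaper at the high end
print(manual_cost / tool_low)   # ~65x cheaper at the low end
```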
Where to go deeper
This article sits inside the GEO measurement cluster. The companion playbooks below cover the full measurement stack, the tooling teardown, and the weekly operator’s loop.
- Best GEO Tools 2026: Honest Teardown of 9 Platforms ships verdict slots by ARR stage and the 90-day ROI formula. Pair with this article to decide which tool you actually need.
- How to Measure GEO Performance: The Weekly Operator’s Playbook covers the 200-prompt set, ±2 SD noise floor and 90-minute Monday loop. The weekly cadence sits on top of this baseline.
- GEO Tools and Analytics: Complete Measurement Guide is the pillar that defines the four metrics (citation share, engine spread, passage prominence, answer churn) you will use in the scorecard above.
- How to Do GEO in 2026: The 12-Week Playbook is the upstream piece that ties the measurement work to the content sprint you run in parallel.
The competitor that worries you most in your weekly Slack is rarely the one that beats you in AI search. Run the baseline. The actual leader is usually the one you forgot to track.