"AI search visibility" is a real thing to measure, but most of the dashboards built around it are measuring the wrong things — or measuring the right things in ways that don't survive contact with how the AI engines actually behave.
The single biggest mistake we see: monitoring built around mention counts with no relative context. "We were mentioned 42 times this month" is a number. It is not a signal. Without a denominator (out of how many tracked queries?), a competitor set (who else was mentioned for the same queries?), and an engine breakout (mentioned where?), it is the AI-era equivalent of "we had 10,000 impressions" — a vanity metric.
Here's what monitoring should actually track, what to ignore, and how to operate the data.
The five metrics worth tracking
1. Citation share, per engine
The percentage of your tracked query set where the brand earns a clickable citation — broken out by engine. The breakout matters because ChatGPT, Perplexity, Claude, Gemini, and Google AI Overviews each have different citation behavior. Winning ChatGPT and losing Perplexity is a different problem than winning Perplexity and losing ChatGPT, and aggregating across engines obscures both.
We document the citation share methodology — query set construction, engine sampling cadence, citation parsing — on the citation share monitoring page.
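As a rough illustration of the calculation itself, here is a minimal sketch. It assumes each observation has already been reduced to an engine name, a query, and a boolean for whether the brand earned a citation. The record shape, the function name, and the sample queries are hypothetical, not the production methodology.

```python
from collections import defaultdict

def citation_share_by_engine(observations):
    """Return {engine: fraction of tracked queries where the brand was cited}."""
    tracked = defaultdict(set)  # engine -> queries sampled on that engine
    cited = defaultdict(set)    # engine -> queries where the brand was cited
    for obs in observations:
        tracked[obs["engine"]].add(obs["query"])
        if obs["brand_cited"]:
            cited[obs["engine"]].add(obs["query"])
    # The denominator is the tracked query set, per engine, never a raw count.
    return {engine: len(cited[engine]) / len(tracked[engine]) for engine in tracked}

# Hypothetical sample: two queries sampled on two engines.
observations = [
    {"engine": "chatgpt", "query": "best trail shoes", "brand_cited": True},
    {"engine": "chatgpt", "query": "waterproof hiking boots", "brand_cited": False},
    {"engine": "perplexity", "query": "best trail shoes", "brand_cited": False},
    {"engine": "perplexity", "query": "waterproof hiking boots", "brand_cited": False},
]
shares = citation_share_by_engine(observations)
```

Keeping the result keyed by engine, rather than averaging it into one number, is what preserves the "winning ChatGPT, losing Perplexity" distinction.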
2. First-mention position
When the brand IS cited, where in the citation list does it appear? First, third, seventh? Earlier citations draw more user attention and, in many cases, carry more weight in answer composition. A brand that's cited last in a five-source list is in a weaker position than a brand cited first in a three-source list, even if both have the same citation rate.
We measure first-mention position as a percentage: of the queries where the brand is cited, the share in which it is cited first.
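A minimal sketch of that percentage, assuming each answer has been parsed into an ordered list of cited sources. The list format and the `acme.example` and `rival.example` names are placeholders for illustration.

```python
def first_mention_rate(citation_lists, brand):
    """Of the answers that cite `brand`, the fraction where it is cited first."""
    cited = [sources for sources in citation_lists if brand in sources]
    if not cited:
        return None  # brand never cited: the metric is undefined, not zero
    first = sum(1 for sources in cited if sources[0] == brand)
    return first / len(cited)

# Hypothetical parsed citation lists, in the order the engine displayed them.
citation_lists = [
    ["acme.example", "rival.example"],
    ["rival.example", "acme.example", "other.example"],
    ["rival.example"],
    ["acme.example"],
]
rate = first_mention_rate(citation_lists, "acme.example")
```

Here the brand is cited in three of the four answers and leads two of them, so the rate is two thirds. The answer that never cites the brand is excluded from the denominator; it counts against citation share instead.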
3. Mention rate (with or without a citation)
Mention rate is the percentage of tracked queries where the brand is mentioned in the answer body — whether or not a clickable source is attached. AI engines sometimes mention a brand in the body of the answer without linking to its source. That's visibility for the user but no traffic for the brand.
The gap between mention rate and citation share is itself a metric. A brand mentioned on 60% of tracked queries but cited on only 30% has a source-document problem (the engine knows about you but isn't citing your pages as the source for what it's saying). The fix is on-site: schema, content depth, source-document structure.
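The arithmetic behind that gap can be sketched as follows, assuming one record per tracked query with two booleans. The field names are hypothetical.

```python
def mention_citation_gap(results):
    """Return (mention_rate, citation_rate, gap) over one tracked query set."""
    if not results:
        raise ValueError("no tracked queries: the rates have no denominator")
    n = len(results)
    mention_rate = sum(r["mentioned"] for r in results) / n
    citation_rate = sum(r["cited"] for r in results) / n
    return mention_rate, citation_rate, mention_rate - citation_rate

# Ten hypothetical queries: mentioned in six answers, cited in three of those.
results = (
    [{"mentioned": True, "cited": True}] * 3
    + [{"mentioned": True, "cited": False}] * 3
    + [{"mentioned": False, "cited": False}] * 4
)
mention_rate, citation_rate, gap = mention_citation_gap(results)
```

With these inputs the mention rate is 60%, the citation rate is 30%, and the gap is 30 points, matching the example in the text.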
4. Competitor citation share, on the same query set
Citation share without competitor context is uncalibrated. If our client is cited for 35% of queries in their category, is that good? You don't know until you measure the named competitors on the same query set on the same engines on the same day.
The competitor view answers two questions:
- Are we gaining or losing share relative to peers?
- Which competitors are gaining, and what are they doing differently?
We track named competitors at the same cadence as the client, with the same query set, on the same engines. That's the only way the share number means anything.
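A minimal sketch of a competitor comparison between two runs on one engine, assuming each run records the set of brands cited per query. The brand names and data shape are made up for illustration. The sketch refuses to compare runs that cover different query sets, since the share numbers would not be comparable.

```python
def citation_share(observations, brand):
    """Fraction of observed queries whose citation set includes `brand`."""
    return sum(brand in o["cited"] for o in observations) / len(observations)

def competitor_diff(previous, current, brands):
    """Change in citation share per brand between two runs of the same query set."""
    if {o["query"] for o in previous} != {o["query"] for o in current}:
        raise ValueError("runs cover different query sets; shares are not comparable")
    return {b: citation_share(current, b) - citation_share(previous, b) for b in brands}

# Hypothetical runs of the same four queries, one month apart.
last_month = [
    {"query": "q1", "cited": {"acme", "rival"}},
    {"query": "q2", "cited": {"acme"}},
    {"query": "q3", "cited": set()},
    {"query": "q4", "cited": set()},
]
this_month = [
    {"query": "q1", "cited": {"rival"}},
    {"query": "q2", "cited": {"acme", "rival"}},
    {"query": "q3", "cited": {"rival"}},
    {"query": "q4", "cited": set()},
]
diff = competitor_diff(last_month, this_month, ["acme", "rival"])
```

In this sample the client ("acme") drops from 50% to 25% while the competitor climbs from 25% to 75%, which is the kind of movement a standalone share number would hide.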
5. Brand entity accuracy, per engine
Citation and mention measure presence. Entity accuracy measures whether what the engine is saying about the brand is true. Each quarter (sometimes monthly for clients in fast-changing categories), we ask each engine direct entity questions — what does the brand do, where is it headquartered, what does it sell, who runs it — and we score the answers for factual correctness.
This is the metric that catches hallucinations early. We've seen AI engines confidently misstate a brand's founding date, founders, headquarters, and product line — often pulled from outdated third-party sources the engine still trusts. The entity accuracy track is what surfaces those, so the underlying sources can be corrected.
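One way to structure the scoring is sketched below. It assumes the engine answers have already been reduced to short field values and compares them to a verified fact sheet by normalized exact match. That matching rule is an assumption for illustration only; free-text answers would normally need human review. All names and facts are invented.

```python
def entity_accuracy(ground_truth, engine_answers):
    """Score each engine's answers against verified brand facts, field by field."""
    report = {}
    for engine, answers in engine_answers.items():
        wrong = [
            field
            for field, truth in ground_truth.items()
            # Assumption: case-insensitive exact match stands in for real review.
            if answers.get(field, "").strip().casefold() != truth.strip().casefold()
        ]
        correct = len(ground_truth) - len(wrong)
        report[engine] = {
            "accuracy": correct / len(ground_truth),
            "wrong_fields": wrong,
        }
    return report

# Hypothetical fact sheet and extracted answers.
ground_truth = {"headquarters": "Denver", "founded": "2014",
                "category": "trail footwear", "ceo": "Jane Doe"}
engine_answers = {
    "engine_a": {"headquarters": "denver", "founded": "2014",
                 "category": "Trail footwear", "ceo": "Jane Doe"},
    "engine_b": {"headquarters": "Boulder", "founded": "2011",
                 "category": "trail footwear", "ceo": "Jane Doe"},
}
report = entity_accuracy(ground_truth, engine_answers)
```

The `wrong_fields` list is the useful part: it names which facts to trace back to outdated third-party sources.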
The three metrics to stop tracking
"Mentions" with no denominator
A raw count of mentions across all engines, across all queries, with no denominator and no comparison set, is a vanity metric. It will trend up over time because the AI ecosystem itself is growing and the data sources are expanding. That trend reflects almost nothing about the brand's actual performance.
If your dashboard says "523 brand mentions this month" with no other context, that line should be deleted.
"AI traffic" as a single aggregated number
Some teams roll all AI-driven referral traffic into a single number, then track its growth. The problem is that the engines route traffic very differently — some pass referrer headers cleanly, some don't, some pass user-agent strings that get filtered out of analytics by default, some don't. The single aggregated number is composed of measurements from different engines with different fidelities, smoothed into something that looks tidy but isn't comparable.
Track AI traffic per engine, with the known measurement caveats, or don't track it at all.
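A minimal sketch of a per-engine breakout, assuming sessions are classified by referrer hostname. The host-to-engine mapping below is an illustrative assumption, not a complete or verified list, and sessions that arrive with no referrer are counted separately rather than guessed at.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed mapping for illustration; verify against your own analytics data.
ENGINE_HOSTS = {
    "chatgpt.com": "chatgpt",
    "perplexity.ai": "perplexity",
    "gemini.google.com": "gemini",
    "claude.ai": "claude",
}

def classify_referrer(referrer):
    """Map a referrer URL to an engine label, 'other', or 'no_referrer'."""
    if not referrer:
        return "no_referrer"  # stripped header: cannot be attributed to any engine
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return ENGINE_HOSTS.get(host, "other")

def sessions_by_engine(referrers):
    """Count sessions per engine label instead of one aggregated 'AI traffic' total."""
    return Counter(classify_referrer(r) for r in referrers)

# Hypothetical referrer values pulled from a session log.
referrers = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=trail+shoes",
    "",
    None,
    "https://news.example/post",
]
counts = sessions_by_engine(referrers)
```

The `no_referrer` bucket is the measurement caveat made visible: it is traffic whose source is unknown, and its size tells you how much to trust the per-engine numbers.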
Generic "AI sentiment scores"
There is a category of dashboard that scores "how AI engines feel about your brand" on a 1-10 sentiment scale. The number is usually derived from an NLP pass over the engines' answer text, applied uniformly across all queries.
The problem: the engines don't have sentiments. They have answers, and those answers are factually correct or factually wrong, complete or incomplete, current or stale. An "AI sentiment score" abstracts those concrete properties into a number that's not actionable. We don't use it. We've never seen a meaningful product decision driven by it.
What the data is for
Tracking is not the work. The work is what you do with the tracking. The four operational outputs of a healthy AI visibility monitoring program:
- A weekly priority list — which engines, which queries, which gaps are urgent enough to address this week.
- A monthly competitor diff — what competitors gained, what we lost, where the content roadmap needs to adjust.
- A quarterly entity audit — what each engine believes about the brand, what needs correction, which source documents need updating.
- A semi-annual strategic review — which engines deserve more attention, which deserve less, where the brand's overall positioning needs to evolve.
If your dashboard produces numbers but not those four artifacts, the monitoring isn't doing its job.
Where to start
Most brands we engage with have either no AI visibility monitoring at all, or monitoring that overproduces metrics and underproduces action. The starting point is the query set: 50 to 200 queries that represent the actual intent the brand needs to win on, drawn from real customer questions and real search data, not invented in a brainstorming session.
The query set is the anchor for everything that follows. Citation share is measured against it. First-mention position is measured against it. The competitor diff is measured against it. Get the query set right and the rest of the methodology has something stable to operate on. Get it wrong and the entire monitoring program is fitting itself to the wrong target.
We cover the full methodology — query set construction, engine sampling, citation parsing, competitor handling, entity audit cadence — on the citation share monitoring page. The most recent first-party data on what citation patterns actually look like in the wild is in /reports/state-of-ai-shopping-citations-2026.
If you want to talk through AI visibility monitoring for your brand specifically, we'd like to hear from you. For the content and technical side of moving the metrics once you can see them, see AI SEO services and AEO services.
