Measuring AI Visibility: The 5 Metrics That Actually Matter
TL;DR — Most AI visibility dashboards track vanity metrics — number of citations, raw mention counts — that don't tie back to business outcomes. After working with 40+ brands on GEO measurement, we've converged on five metrics that actually drive decisions: Share of AI Voice (SOAV), Sentiment Score, Query Coverage Breadth, Citation Depth, and AI-Sourced Conversion Rate. Each is specific enough to measure, actionable enough to iterate on, and generalizable enough to work across Chinese AI platforms.
The vanity metrics to stop tracking
Before covering what to measure, here's what to ignore:
Total citation count. Without a denominator (how many queries ran, over what period, on what platforms), this is meaningless. A brand with 10,000 citations across 100,000 queries has the same 10% citation rate as one with 10 citations across 100 queries, yet raw count makes the first look a thousand times more visible.
Follower or subscriber count on AI platforms. With few exceptions, AI assistants don't have "followers." Metrics that measure user subscription to the platform itself (Kimi users, Doubao MAU) are platform-level indicators, not brand-level.
Brand name search volume. This measures intentional brand search, not AI-mediated discovery. It's useful as a secondary signal but doesn't represent AI visibility.
Social engagement on content you publish. Likes and shares on your Douyin or WeChat content are marketing signals, not AI citation signals. They sometimes correlate with citations, but the connection isn't causal.
Now the five that work.
Metric 1: Share of AI Voice (SOAV)
What it measures: The percentage of AI responses to category queries that mention your brand, compared to competitors.
How to compute: Select 20-50 representative category queries ("best smart home brands", "top Chinese AI platforms", etc.). Run each through your target AI platforms (DeepSeek, Doubao, Yuanbao, Qwen, Kimi, ERNIE) weekly. Count how many responses mention your brand. Divide by total responses.
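The division above is simple enough to script. A minimal sketch in Python — the function name and sample responses are illustrative placeholders, not output from any real platform:

```python
# Minimal SOAV computation over a batch of collected AI responses.
# Sample responses and brand names are illustrative only.

def share_of_ai_voice(responses, brand):
    """Fraction of AI responses that mention the brand (case-insensitive)."""
    if not responses:
        return 0.0
    mentions = sum(1 for r in responses if brand.lower() in r.lower())
    return mentions / len(responses)

responses = [
    "For smart home, consider BrandA and BrandB.",
    "BrandB is the most popular choice this year.",
    "Top picks include BrandC.",
    "BrandA offers the widest device ecosystem.",
]
print(share_of_ai_voice(responses, "BrandA"))  # 0.5
```

In practice you'd run the same computation per competitor over the same response set to get comparable shares.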
Why it matters: This is the most directly comparable metric to traditional share of voice (SOV) from earned media measurement. It translates AI visibility into a familiar marketing KPI and benchmarks you against competitors.
Interpretation:
- SOAV below 10% means you're effectively invisible in AI responses for that category
- SOAV 10-30% means you're a credible but not dominant player
- SOAV above 50% means you're a category leader in AI responses
- Track changes quarter-over-quarter; the trajectory matters more than the absolute number
Common pitfall: Gaming the query list. If you only measure SOAV on queries where you already perform well, the metric inflates. Curate queries based on business relevance, not current performance.
Metric 2: Sentiment Score
What it measures: When AI models mention your brand, is the framing positive, neutral, or negative?
How to compute: For each AI response that mentions your brand, classify the mention: positive (recommends, praises, cites positively), neutral (mentions without judgment), or negative (warns against, compares unfavorably, cites with criticism). Compute weighted score: positive=+1, neutral=0, negative=-1. Average across all mentions.
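The weighted averaging described above can be sketched as follows; the label strings and helper name are assumptions for illustration:

```python
# Weighted sentiment averaging: positive=+1, neutral=0, negative=-1.

SENTIMENT_WEIGHTS = {"positive": 1, "neutral": 0, "negative": -1}

def sentiment_score(labels):
    """Average weighted sentiment across all classified brand mentions."""
    if not labels:
        return 0.0
    return sum(SENTIMENT_WEIGHTS[label] for label in labels) / len(labels)

print(sentiment_score(["positive", "positive", "neutral", "negative"]))  # 0.25
```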
Why it matters: A high SOAV with negative sentiment is worse than a moderate SOAV with positive sentiment. Sentiment captures whether AI is actively helping or hurting your brand.
Interpretation:
- Score > +0.5: strongly positive positioning in AI responses
- Score 0 to +0.5: generally neutral to positive
- Score -0.5 to 0: mixed; investigate negative mentions
- Score < -0.5: AI is actively harming brand perception; urgent investigation needed
Common pitfall: Over-indexing on a single negative mention. Isolate the source and understand whether it's a factual issue (something to fix) or an opinion issue (something to counter-narrative).
Metric 3: Query Coverage Breadth
What it measures: How many distinct user queries trigger a brand mention, expressed as a ratio to total queries measured.
How to compute: Maintain a rotating list of 100-200 category queries (not just your target queries). Measure how many of these trigger a brand mention in AI responses. Divide by total queries.
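Since breadth reuses the same raw data as SOAV with a different aggregation, the key step is collapsing per-platform rows down to per-query coverage. A sketch with hypothetical row data:

```python
# A query counts as covered if the brand appeared in ANY response to it,
# regardless of platform. Row data here is illustrative.
from collections import defaultdict

def coverage_breadth(rows):
    """rows: (query, mentioned) pairs pooled across platforms and runs."""
    covered = defaultdict(bool)
    for query, mentioned in rows:
        covered[query] |= mentioned
    return sum(covered.values()) / len(covered) if covered else 0.0

rows = [("q1", True), ("q1", False), ("q2", False), ("q3", True)]
print(coverage_breadth(rows))  # 2 of 3 queries covered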
Why it matters: Breadth captures whether you're a one-trick brand or a broad category authority. A brand mentioned in 20/100 queries is narrow (niche positioning). A brand mentioned in 80/100 is broad (category leadership).
Interpretation:
- Below 15%: narrowly positioned; relies on specific query types for visibility
- 15-40%: moderate breadth; known for some aspects of the category
- 40-70%: broad authority across the category
- Above 70%: category-defining presence
Common pitfall: Including queries that don't make business sense. If your brand is a specialty B2B SaaS, you don't need to win consumer-oriented queries. Scope breadth measurement to queries your buyers actually ask.
Metric 4: Citation Depth
What it measures: When AI models cite your brand, how central is the mention to the response?
How to compute: Score each mention on a 1-4 scale:
- 4: Primary subject of the response (the response is about your brand)
- 3: Primary recommendation among multiple options
- 2: Included in a list of multiple options
- 1: Mentioned incidentally or as context
Average the citation depth score across all mentions.
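Averaging the 1-4 scores is straightforward; a small helper with a sanity check on the scale (the name is illustrative):

```python
def citation_depth(scores):
    """Average mention depth; every score must be on the 1-4 scale."""
    if not scores:
        return 0.0
    if any(not 1 <= s <= 4 for s in scores):
        raise ValueError("depth scores must be between 1 and 4")
    return sum(scores) / len(scores)

print(citation_depth([4, 3, 2, 2, 1]))  # 2.4
```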
Why it matters: A brand mentioned at depth 4 in 10 responses (primary subject) generates more business impact than a brand mentioned at depth 1 in 50 responses (barely mentioned). Depth is the quality dimension that balances SOAV's quantity dimension.
Interpretation:
- Depth score > 3.0: high-impact mentions; your brand is being positioned as the answer
- Depth score 2.0-3.0: solid positioning; often recommended
- Depth score 1.0-2.0: listing without recommendation; visible but not chosen
- Depth score close to 1.0: incidental mentions only; minimal business impact (on a 1-4 scale, the average cannot fall below 1.0)
Common pitfall: Inflating depth scores by running queries that essentially ask about your brand directly. "What is [your brand]?" queries will always generate depth-4 mentions — track them separately from category-neutral queries.
Metric 5: AI-Sourced Conversion Rate
What it measures: Among users who visit your website or product from AI-surfaced links, what percentage convert? How does that compare to other traffic sources?
How to compute: Set up UTM tracking for AI-platform referrals (Kimi surfaces source links, some Doubao answers surface links, etc.). Measure conversion rate of that traffic (newsletter signup, product trial, demo request, purchase).
Why it matters: Ties AI visibility to business outcomes. High AI citation with low conversion indicates a positioning or fit issue. Low AI citation with high conversion indicates room to scale. Both are actionable.
Interpretation: This metric is relative, not absolute. Benchmark AI-sourced conversion against your other traffic sources (organic search, direct, paid, social):
- AI conversion > Organic conversion: AI referrals are higher-intent; invest more in AI visibility
- AI conversion ≈ Organic conversion: parity; normal treatment
- AI conversion < Organic conversion: qualification issue; investigate intent mismatch
Common pitfall: Tiny sample sizes mislead. Don't draw conclusions from fewer than 100 AI-sourced sessions in a measurement window. Many brands see only 10-30 AI sessions initially, which is too noisy for statistical inference.
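That session threshold can be enforced directly in the reporting helper, so under-sampled windows never surface a rate at all. A sketch — the function name and figures are illustrative, and the 100-session cutoff is the one from this article:

```python
def ai_conversion_rate(sessions, conversions, min_sessions=100):
    """Conversion rate for AI-sourced traffic, or None when the sample
    is below the minimum session threshold and too noisy to report."""
    if sessions < min_sessions:
        return None
    return conversions / sessions

print(ai_conversion_rate(30, 3))    # None: too few sessions
print(ai_conversion_rate(250, 10))  # 0.04
```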
The measurement cadence
How often to measure each:
| Metric | Cadence | Effort |
|---|---|---|
| Share of AI Voice | Weekly | Automated scripts or manual sampling |
| Sentiment Score | Monthly | Automated classification + spot-check |
| Query Coverage Breadth | Monthly | Same data as SOAV, different aggregation |
| Citation Depth | Monthly | Automated classification + spot-check |
| AI-Sourced Conversion Rate | Continuous | UTM tracking + analytics platform |
SOAV runs weekly because it's the most sensitive to content publishing cadence; the others run monthly to keep measurement overhead manageable.
Reporting framework
For an executive dashboard, present these five as a monthly scorecard:
- SOAV: current % vs. last month, vs. 3 months ago, vs. top competitor
- Sentiment Score: current vs. last month, with any negative trends flagged
- Query Coverage: breadth %, with notable wins and losses
- Citation Depth: avg score, with depth-4 queries listed
- AI-Sourced Conversion: rate vs. organic rate, flagged if > 20% delta
This format gives executives a 5-minute read that captures both quantity and quality of AI visibility.
What to do when metrics move
SOAV drops unexpectedly. Most common causes: a competitor published a high-quality benchmark that displaces you; an algorithm update shifted platform preferences; your content has decayed (stale data, outdated examples).
Sentiment turns negative. Investigate which responses generate negative sentiment. Often it's a single wrong answer that gets quoted repeatedly — fixable by creating authoritative content that counter-narrates the misconception.
Query Coverage stalls. You've hit a breadth ceiling. Expand into adjacent query territories with new content, or refine positioning to go deeper on current breadth.
Citation Depth drops. You're being listed but not recommended. Investigate whether competitors are pushing more "best in category" claims than you are. Pure frequency of mention doesn't equal recommendation strength.
AI-Sourced Conversion drops. Usually a qualification issue — the AI is sending wrong-fit traffic. Audit the queries driving AI traffic and consider whether your content is matching intent.
Tools and approaches
There is no single best tool for AI visibility measurement in 2026. The options:
Build your own lightweight measurement. For 20-50 queries on 2-3 platforms weekly, a simple Python or Node.js script querying each platform's public interface (or API when available) is sufficient. ByteEngine's internal tooling started here.
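As a starting point for a build-your-own setup, here's a minimal weekly runner. `fetch` is a placeholder you'd implement per platform (using an official API where one exists); the query list, platform names, brand, and CSV layout are all assumptions, not a real SDK:

```python
import csv
import datetime
import io

# Illustrative configuration -- replace with your own query list and targets.
QUERIES = ["best smart home brands", "top Chinese AI platforms"]
PLATFORMS = ["kimi", "doubao"]
BRAND = "YourBrand"

def run_weekly(fetch, out):
    """fetch(platform, query) -> response text; out is any writable file.
    Writes one CSV row per (platform, query) with a 0/1 mention flag."""
    writer = csv.writer(out)
    today = datetime.date.today().isoformat()
    for platform in PLATFORMS:
        for query in QUERIES:
            text = fetch(platform, query)
            writer.writerow([today, platform, query,
                             int(BRAND.lower() in text.lower())])

# Stubbed fetch for illustration; swap in real per-platform calls.
buf = io.StringIO()
run_weekly(lambda p, q: f"An answer mentioning YourBrand for: {q}", buf)
print(len(buf.getvalue().splitlines()))  # 4 rows: 2 platforms x 2 queries
```

Appending each run's rows to one file gives you the longitudinal data that SOAV, breadth, and trend reporting all aggregate from.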
Use ChinaRankAI's analysis platform. Designed specifically for this — automated query running across Chinese AI platforms, sentiment classification, and trend tracking. Check your brand's AI visibility.
Hybrid. Most brands with mature GEO practices combine scripted measurement for routine tracking with platform tools for deeper analytics.
Measurement checklist
- Curated query list (20-50 queries) representing business-relevant category
- Weekly SOAV measurement across target platforms
- Monthly sentiment scoring with spot-checks
- Monthly query coverage breadth tracking
- Monthly citation depth assessment
- Continuous AI-sourced traffic and conversion tracking
- Monthly scorecard format for executive reporting
- Quarterly review of query list itself (add, remove, replace)
Related reading
- Building Your First AI Rank Tracker
- 8 Content Formats Chinese AI Platforms Cite Most
- The Complete Guide to AI Search in China 2026
About ByteEngine (杭州字节引擎人工智能科技有限公司)
ByteEngine provides GEO measurement and optimization across Chinese AI platforms. Our measurement framework combines automated query running with expert sentiment and depth analysis. Learn more or check your brand's AI visibility.
