When we started building Ataiva, the first question was: how do you even measure AI visibility? There's no equivalent of Google Search Console for ChatGPT. No rank tracker. No click-through rate data. We had to build the measurement framework from scratch.

After a year of tracking 1,000+ brands across four AI engines, here are the metrics we've settled on — and why each one matters.

Metric 1: Mention Rate

This is the most fundamental metric. For a given set of relevant prompts, what percentage result in your brand being mentioned?

We calculate it by running a standardized set of prompts relevant to your category across all four engines. If you're a CRM, we run 50 CRM-related prompts. If you're mentioned in 15 of the 200 total responses (50 prompts × 4 engines), your mention rate is 7.5%.
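
For concreteness, here's a minimal sketch of that calculation in Python. The ask_engine function is a hypothetical stand-in for whatever client queries each engine; this isn't Ataiva's actual pipeline:

  ENGINES = ["chatgpt", "claude", "gemini", "perplexity"]

  def mention_rate(brand, prompts, ask_engine):
      """Percent of (prompt, engine) responses that mention the brand."""
      total, mentions = 0, 0
      for engine in ENGINES:
          for prompt in prompts:
              response = ask_engine(engine, prompt)
              total += 1
              # Naive substring match; a real pipeline needs alias
              # handling ("HubSpot" vs. "Hubspot") and disambiguation.
              if brand.lower() in response.lower():
                  mentions += 1
      return 100 * mentions / total

  # 15 mentions across 200 responses (50 prompts x 4 engines) -> 7.5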

Benchmarks from our data:

  • Category leaders: 60-80% mention rate
  • Strong challengers: 20-40%
  • Visible but not dominant: 5-20%
  • Effectively invisible: below 5%

Most brands we onboard start below 5%. That's not a failure — it's a baseline. The important thing is tracking how it changes over time as you implement AI visibility strategies.

Metric 2: Sentiment Score

Being mentioned isn't enough. How you're mentioned matters enormously.

We classify every mention as positive, neutral, or negative based on the framing. "HubSpot is a popular and well-regarded CRM" is positive. "HubSpot is a CRM platform" is neutral. "HubSpot can be expensive for small teams and has a steep learning curve" is negative.

Your sentiment score is net sentiment as a share of all mentions: (positive mentions - negative mentions) / total mentions, scaled to a range of -100 to +100.
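
In code, the score itself is a few lines (a sketch; classifying each mention as positive, neutral, or negative is the hard part, and is assumed to have already happened):

  def sentiment_score(positive, neutral, negative):
      """Net sentiment on a -100 to +100 scale."""
      total = positive + neutral + negative
      if total == 0:
          return 0.0  # no mentions, no signal
      return 100 * (positive - negative) / total

  # sentiment_score(10, 6, 4) -> 100 * (10 - 4) / 20 = +30.0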

A brand with a high mention rate but negative sentiment has a bigger problem than a brand with a low mention rate. Negative AI mentions actively drive customers away. We've seen brands where ChatGPT consistently frames them as "expensive" or "complex" — even when that's not accurate — and it directly impacts their pipeline.

If your sentiment is negative, the fix is usually about correcting the narrative on third-party sources. AI engines pick up sentiment from reviews, Reddit discussions, and comparison articles. If those sources frame you negatively, the AI will too.

Metric 3: Citation Coverage

When AI engines mention your brand, do they cite your website as a source? This matters for two reasons: it drives direct traffic, and it signals that the AI considers your content authoritative.

Citation coverage = (responses that cite your domain) / (responses that mention your brand).
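
A sketch of the calculation, assuming each tracked response records whether it mentions the brand and which domains it cites (the field names here are illustrative, not our actual schema):

  def citation_coverage(responses, domain):
      """Share of brand-mentioning responses that also cite the domain."""
      mentioning = [r for r in responses if r["mentions_brand"]]
      if not mentioning:
          return 0.0
      cited = sum(1 for r in mentioning if domain in r["cited_domains"])
      return 100 * cited / len(mentioning)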

Perplexity cites sources on every response, so citation coverage is easiest to track there. ChatGPT cites sources when it uses web browsing, but not consistently. Claude and Gemini are less consistent still.

High citation coverage means AI engines are using your content as a reference, not just mentioning your brand name in passing. It means you're a source of truth, not just a data point. Brands with citation coverage above 40% typically see measurable referral traffic from AI engines.

Metric 4: Engine Breadth

How many of the four major engines mention you? This is the simplest of the five metrics, but it's critically important.

We score it as: number of engines where your mention rate exceeds 5%, divided by 4.
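
A sketch of that calculation, assuming per-engine mention rates are already computed:

  def engine_breadth(rates_by_engine, threshold=5.0):
      """Percent of engines whose mention rate exceeds the threshold."""
      visible = sum(1 for rate in rates_by_engine.values() if rate > threshold)
      return 100 * visible / len(rates_by_engine)

  # engine_breadth({"chatgpt": 12.0, "claude": 1.0,
  #                 "gemini": 0.5, "perplexity": 2.0}) -> 25.0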

A brand that's visible on ChatGPT but invisible on Claude, Gemini, and Perplexity has an engine breadth of 25%. That's risky — you're dependent on a single engine, and user preferences are shifting constantly.

We see a lot of brands that optimize for ChatGPT (because it's the biggest) and ignore the others. That's a mistake. Claude is growing fast in professional contexts. Gemini is integrated into Google's ecosystem. Perplexity is the default for research-heavy queries. You need presence across all of them.

Metric 5: Accuracy Rate

What percentage of factual claims AI engines make about your brand are correct? This covers pricing, features, positioning, founding date, team size — anything verifiable.

We check this by comparing AI responses against your actual product information. An accuracy rate below 70% means AI is actively spreading misinformation about your brand. We see this more often than you'd think — the average accuracy rate across our index is 63%.

Low accuracy is fixable. Structured data, an llms.txt file, and consistent information across third-party sources all improve accuracy. Brands that implement all three typically see accuracy jump from ~60% to ~85% within 6-8 weeks.
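
Of those three fixes, llms.txt is the least familiar. Here's a minimal sketch of one, following the llms.txt proposal's suggested markdown structure; the brand, copy, and URLs are illustrative:

  # Acme CRM

  > Acme CRM is a customer relationship management platform for small
  > sales teams. Pricing starts at $29 per user per month.

  ## Key pages

  - [Pricing](https://acme.example/pricing): current plans and tiers
  - [Product overview](https://acme.example/product): features and integrations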

The Ataiva Visibility Score

We combine these five metrics into a single 0-100 score. The weighting:

  • Mention Rate: 30%
  • Sentiment: 25%
  • Citation Coverage: 20%
  • Engine Breadth: 15%
  • Accuracy: 10%

Mention rate gets the highest weight because it's the most fundamental — if you're not being mentioned, nothing else matters. Sentiment is second because negative mentions are worse than no mentions. Accuracy gets the lowest weight because it's the most fixable.
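
Putting it together, a sketch of the composite, with one assumption flagged: sentiment's -100 to +100 scale is rescaled to 0-100 before weighting, since the other inputs are already 0-100.

  WEIGHTS = {
      "mention_rate": 0.30,
      "sentiment": 0.25,
      "citation_coverage": 0.20,
      "engine_breadth": 0.15,
      "accuracy": 0.10,
  }

  def visibility_score(metrics):
      """Weighted 0-100 composite of the five metrics."""
      m = dict(metrics)
      # Sentiment arrives on a -100..+100 scale; rescale it to 0..100
      # before weighting (an assumption; the post doesn't specify
      # the normalization).
      m["sentiment"] = (m["sentiment"] + 100) / 2
      return sum(WEIGHTS[k] * m[k] for k in WEIGHTS)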

You can see your score for free at our visibility report tool. It runs in about 60 seconds and gives you a breakdown across all five metrics and all four engines.

The score isn't the point. The point is having a consistent, repeatable way to measure progress. Run it today, implement changes, run it again in a month. That delta is what matters.