What counts as "AI traffic" on PeerPush?

AI traffic combines three categories: AI crawlers operated by model providers, AI agents browsing on behalf of a user via a chat assistant, and MCP clients making explicit tool calls into PeerPush. Search-engine and analytics bots are categorized separately and excluded.

How fresh is the Signals data?

Aggregations refresh on rolling schedules: short windows update every few minutes and longer windows on slower schedules. The "Data fresh as of" timestamp on each page reflects the slowest aggregation feeding that page.

How does PeerPush protect privacy in Signals?

IP addresses are never stored; only a salted hash is retained. Query strings, search terms, and prompt text pass through a PII redactor before publication, and anonymous-visitor identifiers rotate daily so the same browser is unlinkable across UTC days.

Methodology

How we measure AI attention on PeerPush, what we hide, and when we decide there is enough data to publish a number.

What counts as “AI traffic”

We classify every request hitting PeerPush by user-agent and entry point. “AI traffic” combines three categories:

AI crawlers: training crawlers operated by model providers (e.g., GPTBot, ClaudeBot, PerplexityBot, Google-Extended).
AI agents: browsing on behalf of a user via a chat assistant (e.g., ChatGPT-User, Claude-User).
MCP clients: explicit tool calls into PeerPush's Model Context Protocol endpoint.

Search-engine crawlers (Googlebot, Bingbot) and analytics bots (Ahrefs, Semrush) are excluded; they are categorized separately and not part of the “AI” share.
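The classification described above could be sketched roughly as follows. This is a minimal illustration, not PeerPush's actual classifier: the token lists contain only the examples named in this section, and the function names, category labels, and the `is_mcp_endpoint` flag are assumptions made for the example.

```python
# Hypothetical token lists, seeded only with the examples named in the text.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")
AI_AGENT_TOKENS = ("ChatGPT-User", "Claude-User")
SEARCH_CRAWLER_TOKENS = ("Googlebot", "Bingbot")
ANALYTICS_BOT_TOKENS = ("Ahrefs", "Semrush")

# The three categories that make up "AI traffic".
AI_CATEGORIES = {"ai_crawler", "ai_agent", "mcp_client"}


def _contains_any(user_agent: str, tokens) -> bool:
    """Case-insensitive substring match against a list of tokens."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in tokens)


def classify_request(user_agent: str, is_mcp_endpoint: bool) -> str:
    """Assign one category per request from its entry point and user-agent."""
    if is_mcp_endpoint:  # entry point: explicit MCP tool call
        return "mcp_client"
    if _contains_any(user_agent, AI_AGENT_TOKENS):
        return "ai_agent"
    if _contains_any(user_agent, AI_CRAWLER_TOKENS):
        return "ai_crawler"
    if _contains_any(user_agent, SEARCH_CRAWLER_TOKENS):
        return "search_crawler"
    if _contains_any(user_agent, ANALYTICS_BOT_TOKENS):
        return "analytics_bot"
    return "other"


def is_ai_traffic(category: str) -> bool:
    """Search-engine and analytics bots are excluded from the AI share."""
    return category in AI_CATEGORIES
```

For example, `classify_request("Mozilla/5.0 (compatible; GPTBot/1.0)", False)` falls into `"ai_crawler"`, while a Googlebot user-agent is categorized separately and `is_ai_traffic` returns `False` for it.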

Why we don't show absolute counts (yet)

On a directory the size of PeerPush, raw daily counts can fluctuate enough that a single day-over-day comparison is meaningless. Until volume is high enough for absolute numbers to tell a stable story, we expose shape instead: share %, ordinal rank, and percentage change versus the prior window. The hero share bar appears only after the chosen window has accumulated at least 50 distinct visitors; below that floor the page shows a “Calibrating” placeholder instead.
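A rough sketch of the gating and the "shape" metrics follows. The 50-visitor floor comes from the text; the function names are hypothetical, and computing the share over event counts (rather than visitors) is an assumption made for this example.

```python
from typing import Optional

MIN_DISTINCT_VISITORS = 50  # floor stated in the text


def hero_share(ai_events: int, total_events: int,
               distinct_visitors: int) -> Optional[float]:
    """Return the AI share in percent, or None to signal "Calibrating"."""
    if distinct_visitors < MIN_DISTINCT_VISITORS or total_events == 0:
        return None
    return 100.0 * ai_events / total_events


def pct_change(current: float, prior: float) -> Optional[float]:
    """Percentage change versus the prior window; undefined if prior is 0."""
    if prior == 0:
        return None
    return 100.0 * (current - prior) / prior
```

A caller would render the placeholder whenever `hero_share` returns `None` and otherwise show the percentage alongside `pct_change` for the same window.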

K-anonymity floor

Public Signals surfaces never expose a named subject (a specific product, agent, category, or country) backed by fewer than 5 distinct visitors. Below the floor, the row is suppressed.
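The suppression rule can be illustrated with a small sketch. The floor of 5 is from the text; the `(subject, visitor_id)` event shape and the function name are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

K_FLOOR = 5  # minimum distinct visitors per named subject, from the text


def visible_subjects(events: Iterable[Tuple[str, str]],
                     k: int = K_FLOOR) -> Dict[str, int]:
    """Map each publishable subject to its distinct-visitor count.

    `events` yields (subject, visitor_id) pairs. Subjects backed by fewer
    than `k` distinct visitors are dropped, regardless of raw event volume.
    """
    visitors: Dict[str, Set[str]] = defaultdict(set)
    for subject, visitor_id in events:
        visitors[subject].add(visitor_id)
    return {s: len(v) for s, v in visitors.items() if len(v) >= k}
```

Note that the floor counts distinct visitors, not events: a subject with many events from four visitors is still suppressed.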

Privacy and redaction

IP addresses are never stored. Only a salted hash is retained.
Query strings, search terms, and prompt text pass through a PII redactor before publication. The redactor strips email addresses, phone numbers, OAuth-token shapes, and credit-card-shaped digit sequences.
Anonymous-visitor identifiers rotate daily, so the same anonymous browser is intentionally unlinkable across UTC days.
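The three protections above could look something like the sketch below. Everything here is illustrative: the use of HMAC-SHA256, the function names, the placeholder labels, and especially the regular expressions are assumptions, and real-world PII patterns need to be considerably more thorough than these.

```python
import hashlib
import hmac
import re
from datetime import date


def hash_ip(ip: str, salt: bytes) -> str:
    """Keep only a salted (keyed) hash of the IP, never the address itself."""
    return hmac.new(salt, ip.encode("utf-8"), hashlib.sha256).hexdigest()


def daily_visitor_id(browser_token: str, secret: bytes, utc_day: date) -> str:
    """Mixing the UTC day into the hash makes IDs unlinkable across days."""
    msg = f"{utc_day.isoformat()}|{browser_token}".encode("utf-8")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()[:16]


# Illustrative patterns only; not the production redactor.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")   # 13 to 19 digits
TOKEN = re.compile(r"\b[A-Za-z0-9_-]{32,}\b")    # long opaque strings
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace PII-shaped substrings with placeholders before publication."""
    # Cards run before phones so a card number is not mislabeled as a phone.
    for pattern, label in ((EMAIL, "[email]"), (CARD, "[card]"),
                           (TOKEN, "[token]"), (PHONE, "[phone]")):
        text = pattern.sub(label, text)
    return text
```

With these patterns, `redact("mail me at jane@example.com")` yields `"mail me at [email]"`, and the same browser token produces different identifiers on different UTC days.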

How fresh is the data

Aggregations refresh on rolling schedules: short windows update every few minutes, longer windows on slower schedules. The “Data fresh as of…” timestamp on each Signals page reflects the slowest aggregation still feeding that page.

How each panel is computed

Rising challengers

Each product's engagement combines product-page views, external-website clicks from product cards, alternative-page outbound clicks, and MCP citation impressions in the chosen window. Products are ranked by the percentage change versus the prior equal-length window, with a small minimum-volume floor in both windows.
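As a sketch of the ranking, assuming the four engagement components are summed without weights and that the minimum-volume floor is 10 events (the text only says "small", so `MIN_EVENTS` is a placeholder value):

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

MIN_EVENTS = 10  # hypothetical value; the text only says "small"


@dataclass
class Engagement:
    page_views: int = 0
    external_clicks: int = 0
    alt_page_clicks: int = 0
    mcp_impressions: int = 0

    def total(self) -> int:
        # Assumption: the four components are combined as an unweighted sum.
        return (self.page_views + self.external_clicks
                + self.alt_page_clicks + self.mcp_impressions)


def rising_challengers(current: Dict[str, Engagement],
                       prior: Dict[str, Engagement],
                       min_events: int = MIN_EVENTS) -> List[Tuple[str, float]]:
    """Rank products by % change vs the prior equal-length window."""
    ranked = []
    for product, now in current.items():
        before = prior.get(product)
        if before is None or before.total() == 0:
            continue  # no baseline to compare against
        # The floor applies to both windows.
        if now.total() < min_events or before.total() < min_events:
            continue
        change = 100.0 * (now.total() - before.total()) / before.total()
        ranked.append((product, change))
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked
```

Applying the floor to the prior window as well keeps a product that went from 3 to 500 events from dominating the list on the strength of a tiny baseline.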

AI crawler activity

Top crawlers visiting PeerPush in the chosen window, ordered by total event count, with the same minimum-volume floor.

AI Surface Area (owner dashboard)

On private owner dashboards, AI Surface Area decomposes into four streams: AI crawlers indexing the product, traffic landing from AI chat assistants, programmatic API consumers, and MCP tool calls. Click-through events to external sites are deliberately excluded so the metric reflects discovery exposure, not downstream funnel behavior. Tiles are unweighted: the total is the simple sum of the four streams.

Hottest searches

Top normalized search-bar queries in the last 30 days, ranked by total searches. Queries are lowercased and trimmed before grouping so capitalisation and whitespace don't fragment the leaderboard. Two layers of privacy protection apply: a query has to come from at least 5 distinct visitors before it surfaces, and every query string passes through the PII redactor before it reaches this page.
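The normalize-then-group step might be sketched as below. The `(raw_query, visitor_id)` input shape, the function names, and the `limit` parameter are assumptions; the PII redaction pass is omitted here for brevity.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

K_FLOOR = 5  # distinct-visitor floor from the text


def normalize(query: str) -> str:
    """Lowercase and trim so casing and whitespace do not split a query."""
    return query.strip().lower()


def hottest_searches(searches: Iterable[Tuple[str, str]],
                     k: int = K_FLOOR,
                     limit: int = 10) -> List[Tuple[str, int]]:
    """Rank normalized queries by total searches, after the k floor."""
    counts: Dict[str, int] = defaultdict(int)
    visitors: Dict[str, Set[str]] = defaultdict(set)
    for raw_query, visitor_id in searches:
        query = normalize(raw_query)
        if not query:
            continue
        counts[query] += 1
        visitors[query].add(visitor_id)
    eligible = [(q, n) for q, n in counts.items() if len(visitors[q]) >= k]
    # Most-searched first; ties broken alphabetically for stable output.
    eligible.sort(key=lambda row: (-row[1], row[0]))
    return eligible[:limit]
```

In this sketch, `" CRM "` and `"crm"` land in the same group, and a query searched twenty times by a single visitor never surfaces.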

Hidden gems

Products with a strong AI signal but still-low human visibility over the last 30 days. Human visibility is a composite of upvotes, follows, and average rating. The thresholds re-calibrate as the catalog grows; the panel stays empty until the candidate pool is large enough to make percentile cuts meaningful.

Alternative lens (Switcher pressure)

On the per-alternative lens page, switcher pressure is the share of alt-page visitors who click through to a PeerPush challenger product. The page shows the top 3 challengers only; the full graph and verbatim queries are visible only on the verified owner's private dashboard.
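One possible reading of this metric, sketched under assumptions: the share is computed over distinct visitors, and challengers are ranked by how many distinct alt-page visitors clicked through to them. The function and parameter names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def switcher_pressure(alt_page_visitors: Set[str],
                      clicks: Iterable[Tuple[str, str]],
                      top_n: int = 3) -> Tuple[float, List[str]]:
    """Return (share of visitors who clicked through, top challengers).

    `clicks` yields (visitor_id, challenger_product) pairs.
    """
    clickers: Set[str] = set()
    by_challenger: Dict[str, Set[str]] = defaultdict(set)
    for visitor_id, challenger in clicks:
        if visitor_id in alt_page_visitors:  # ignore clicks from elsewhere
            clickers.add(visitor_id)
            by_challenger[challenger].add(visitor_id)
    share = len(clickers) / len(alt_page_visitors) if alt_page_visitors else 0.0
    ranked = sorted(by_challenger,
                    key=lambda c: (-len(by_challenger[c]), c))
    return share, ranked[:top_n]
```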

Unmet demand

Three complementary signals over the last 30 days. PeerPush search is hybrid (pgvector cosine similarity combined with full-text search via reciprocal rank fusion). Even when the catalog has nothing genuinely close, hybrid search returns weakly related neighbors, so a zero-result count is rare and not a useful signal. We capture three different angles instead.

Weak match: queries where the top result's semantic cosine similarity is below 0.55 (configurable). Suggests the catalog has nothing genuinely close. Captured at insert time from the hybrid search path; the simpler full-text fallback path is excluded.
Low CTR: queries with at least 50 impressions where the click-through rate is below 5%. Either the ranking is wrong, the taglines mislead, or the right product doesn't exist.
Re-search: queries followed by a different query from the same anonymous visitor within 60 seconds. Strong behavioural “I didn't find it” signal. Because anonymous identifiers are only stable within a UTC day, cross-day re-search is intentionally invisible.

Every tab applies the k-anonymity floor of 5 distinct visitors before a query surfaces, and every query string passes through the PII redactor.
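The three signals could be expressed as simple predicates, as in the sketch below. The thresholds are the ones stated above; the function names, the log tuple shape, and the use of `None` to mark the full-text fallback path are assumptions. The k-anonymity floor and redaction are left out for brevity.

```python
from typing import Dict, List, Optional, Tuple

WEAK_MATCH_COSINE = 0.55       # configurable, per the text
LOW_CTR_MIN_IMPRESSIONS = 50
LOW_CTR_THRESHOLD = 0.05
RESEARCH_WINDOW_SECONDS = 60


def is_weak_match(top_cosine: Optional[float]) -> bool:
    """None marks the full-text fallback path, which is excluded."""
    return top_cosine is not None and top_cosine < WEAK_MATCH_COSINE


def is_low_ctr(impressions: int, clicks: int) -> bool:
    """Enough impressions to judge, but under 5% click-through."""
    return (impressions >= LOW_CTR_MIN_IMPRESSIONS
            and clicks / impressions < LOW_CTR_THRESHOLD)


def re_searches(log: List[Tuple[str, float, str]]) -> List[Tuple[str, str]]:
    """Find (abandoned_query, follow_up_query) pairs.

    `log` rows are (visitor_id, unix_timestamp, normalized_query). Visitor
    IDs rotate daily, so pairs can never span two UTC days.
    """
    pairs: List[Tuple[str, str]] = []
    last: Dict[str, Tuple[float, str]] = {}
    for visitor, ts, query in sorted(log, key=lambda r: (r[0], r[1])):
        prev = last.get(visitor)
        if (prev is not None and query != prev[1]
                and ts - prev[0] <= RESEARCH_WINDOW_SECONDS):
            pairs.append((prev[1], query))
        last[visitor] = (ts, query)
    return pairs
```

For instance, a visitor who searches "crm" and then "sales crm" thirty seconds later produces one re-search pair, while a third query several minutes later does not.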

AI citation freshness (owner dashboard)

On private owner dashboards, we track when each of the tracked AI agent buckets (Claude, ChatGPT, Perplexity, Gemini, Copilot) last cited the product. Severity bands: fresh under 14 days, aging 14 to 30 days, stale beyond 30 days, and never cited when no citation has been recorded. Buckets that have never cited the product are surfaced explicitly so the gap is visible. “Cited” means at least one MCP result impression for the product attributed to an agent in the bucket.
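The severity bands map naturally onto a small function. This sketch assumes timezone-aware timestamps and treats exactly 14 and exactly 30 days as "aging"; the names and the band labels are illustrative.

```python
from datetime import datetime, timezone
from typing import Dict, Optional

AGENT_BUCKETS = ("Claude", "ChatGPT", "Perplexity", "Gemini", "Copilot")


def freshness_band(last_cited: Optional[datetime], now: datetime) -> str:
    """Classify the age of the most recent citation."""
    if last_cited is None:
        return "never_cited"
    age_days = (now - last_cited).total_seconds() / 86400
    if age_days < 14:
        return "fresh"
    if age_days <= 30:
        return "aging"
    return "stale"


def freshness_by_bucket(last_cited: Dict[str, datetime],
                        now: datetime) -> Dict[str, str]:
    """Report every tracked bucket, including ones that never cited."""
    return {b: freshness_band(last_cited.get(b), now) for b in AGENT_BUCKETS}
```

Iterating over the fixed bucket list, rather than over the observed citations, is what makes missing buckets show up explicitly as `"never_cited"`.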

AI coverage gap

For each of the top 30 categories (by total events in the last 30 days) and each tracked AI agent bucket (Claude, ChatGPT, Perplexity, Gemini, Copilot), we compute the share of approved and published products in the category that have been cited at least once by that agent in the last 30 days. Categories with fewer than 10 approved products are excluded so single-product percentages don't skew the matrix. The panel is suppressed until total MCP citation volume crosses the privacy floor.
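A sketch of the matrix computation, under assumptions: inputs are already restricted to approved and published products and to the 30-day window, and the top-30 category selection and the panel-wide privacy floor are handled elsewhere. The function and parameter names are hypothetical.

```python
from typing import Dict, Set

AGENT_BUCKETS = ("Claude", "ChatGPT", "Perplexity", "Gemini", "Copilot")
MIN_PRODUCTS = 10  # categories below this size are excluded, per the text


def coverage_matrix(products_by_category: Dict[str, Set[str]],
                    cited_by_bucket: Dict[str, Set[str]]
                    ) -> Dict[str, Dict[str, float]]:
    """Percent of each category's products cited at least once per bucket."""
    matrix: Dict[str, Dict[str, float]] = {}
    for category, products in products_by_category.items():
        if len(products) < MIN_PRODUCTS:
            continue  # too few products for a meaningful percentage
        matrix[category] = {
            bucket: 100.0
            * len(products & cited_by_bucket.get(bucket, set()))
            / len(products)
            for bucket in AGENT_BUCKETS
        }
    return matrix
```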

AI vs human discovery mix shift

A 90-day stacked view of where product-discovery events come from, with a 7-day headline showing the current AI share and the change vs the prior 7 days. Discovery events are page views, list impressions, search impressions, external clicks, MCP result impressions, and API result impressions. Each event is bucketed once: MCP tool calls win first, then AI crawlers, then AI chat referrers, then search-engine referrers, then social, then direct, else “other.” The strip is suppressed and replaced with a calibrating placeholder until the headline window has at least 50 distinct visitors.
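The "bucketed once" precedence can be written as an ordered chain of checks. The `Event` fields and bucket labels below are hypothetical, and counting the first three buckets as the AI share is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Event:
    is_mcp_call: bool = False
    is_ai_crawler: bool = False
    # Hypothetical referrer label: "ai_chat", "search", "social",
    # "" for no referrer (direct), or anything else for unknown sources.
    referrer_kind: str = ""


def discovery_bucket(event: Event) -> str:
    """Assign exactly one bucket; earlier checks win over later ones."""
    if event.is_mcp_call:
        return "mcp"
    if event.is_ai_crawler:
        return "ai_crawler"
    if event.referrer_kind == "ai_chat":
        return "ai_chat"
    if event.referrer_kind == "search":
        return "search"
    if event.referrer_kind == "social":
        return "social"
    if event.referrer_kind == "":
        return "direct"
    return "other"


AI_BUCKETS = {"mcp", "ai_crawler", "ai_chat"}  # assumed to form the AI share


def ai_share(events: Iterable[Event]) -> float:
    """Percent of events whose bucket is one of the AI buckets."""
    buckets = [discovery_bucket(e) for e in events]
    if not buckets:
        return 0.0
    return 100.0 * sum(b in AI_BUCKETS for b in buckets) / len(buckets)
```

Because the checks run in a fixed order, an MCP tool call that also carries a search-engine referrer is counted once, as `"mcp"`.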

Retention

Raw event rows are kept for 540 days, then dropped by an automated retention policy. Aggregated daily, weekly, and monthly rollups (which contain no per-visitor data) are kept indefinitely.

Methodology v2, last updated May 2026.