How to track citations.

This how-to walks through building a minimum-viable AI citation tracker yourself. It is meant for engineers who want to understand the shape of the problem before deciding whether to build or buy. AIRank does all of this and more, but the DIY version is a useful reference.

Time: 4–8 hours to build, plus ongoing maintenance · Difficulty: Advanced
  1. Build a query panel

    Pick 20–50 buyer queries that your ideal customers might ask an AI assistant. These are your test probes. Avoid queries that contain your brand name: they always surface you, so they carry no signal. Prefer unbranded buying questions such as 'what's the best X for Y under $Z'.
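As a rough illustration of the filtering rule above, the sketch below drops branded and duplicate queries from a candidate list. The `Probe` and `build_panel` names, the sample queries, and the brand term "acme" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """One buyer query in the panel (hypothetical structure)."""
    text: str


def build_panel(candidates: list[str], brand_terms: list[str]) -> list[Probe]:
    """Drop branded and duplicate queries; keep the rest in input order."""
    seen: set[str] = set()
    panel: list[Probe] = []
    for raw in candidates:
        # Normalize case and whitespace so trivial variants count as duplicates.
        normalized = " ".join(raw.lower().split())
        if not normalized or normalized in seen:
            continue
        if any(term.lower() in normalized for term in brand_terms):
            continue  # branded queries always surface you, so they carry no signal
        seen.add(normalized)
        panel.append(Probe(text=raw.strip()))
    return panel


# Illustrative candidates; "acme" stands in for your own brand name.
candidates = [
    "what's the best CRM for agencies under $50",
    "Acme CRM pricing",
    "What's the best CRM for agencies under $50 ",
    "top invoicing tools for freelancers",
]
panel = build_panel(candidates, brand_terms=["acme"])
```

Here the branded query and the case/whitespace duplicate are removed, leaving two probes.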

  2. Call the APIs in a clean session

    Query each platform through its API: OpenAI (with a web-search tool enabled), Anthropic (Claude 4.7 with the web search tool), Perplexity (Sonar API), and Google (Gemini with grounding). Run each query at temperature 0, with no memory and no prior context. One query, one response.
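The sketch below shows only the shape of the clean-session loop: every query goes to every platform as a fresh call at temperature 0. It deliberately avoids real SDK calls. `ProviderFn` is a hypothetical adapter signature you would implement once per vendor, and `fake_provider` is a stand-in so the example runs without network access.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable

# Hypothetical adapter signature: (query, temperature) -> raw response text.
# A real adapter would wrap one vendor SDK and enable its web-search tool.
ProviderFn = Callable[[str, float], str]


@dataclass(frozen=True)
class RawResult:
    platform: str
    query: str
    fetched_at: datetime
    raw: str


def run_panel(queries: list[str], providers: dict[str, ProviderFn]) -> list[RawResult]:
    """Send each query to each platform as an independent call."""
    results: list[RawResult] = []
    for query in queries:
        for platform, call in providers.items():
            # No history or memory is passed in: one query, one response.
            raw = call(query, 0.0)
            results.append(RawResult(platform, query, datetime.now(timezone.utc), raw))
    return results


def fake_provider(query: str, temperature: float) -> str:
    """Stand-in for a real API adapter; returns a canned string."""
    return f"answer to: {query} (t={temperature})"


results = run_panel(["best X for Y"], {"stub-a": fake_provider, "stub-b": fake_provider})
```

Keeping the vendor-specific code behind one function type means the scheduler and parser never need to know which platform produced a response.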

  3. Parse the response

    Each response contains prose plus structured citation blocks. Extract URLs from the citation blocks and snippets of 15 or more tokens from the prose. Store both, along with the full raw response for auditing.
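A minimal parsing sketch follows. It assumes a hypothetical, already-normalized response shape of `{"text": ..., "citations": [{"url": ...}]}`; each platform's real payload differs and would need its own adapter. Whitespace-separated words are used as a rough proxy for tokens.

```python
import re
from dataclasses import dataclass

MIN_TOKENS = 15  # from the step above; counted here as whitespace-separated words


@dataclass
class ParsedResponse:
    urls: list[str]
    snippets: list[str]
    raw: dict  # full response kept for audit


def parse_response(response: dict) -> ParsedResponse:
    """Pull unique citation URLs and long-enough prose sentences from a response."""
    urls: list[str] = []
    for citation in response.get("citations", []):
        url = citation.get("url")
        if url and url not in urls:
            urls.append(url)
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", response.get("text", "").strip())
    snippets = [s for s in sentences if len(s.split()) >= MIN_TOKENS]
    return ParsedResponse(urls=urls, snippets=snippets, raw=response)


# Illustrative payload in the assumed shape.
sample = {
    "text": (
        "It depends. For small agencies on a tight budget the most commonly "
        "recommended option is a lightweight CRM with built-in invoicing."
    ),
    "citations": [
        {"url": "https://example.com/crm-guide"},
        {"url": "https://example.com/crm-guide"},
    ],
}
parsed = parse_response(sample)
```

The short sentence is discarded, the long one is kept as a snippet, and the duplicate citation collapses to one URL.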

  4. Match against your site

    For each URL, check whether it falls in your site's URL space (a domain match). For each snippet, compute an embedding and compare it against a pre-computed embedding index of your pages. Use a threshold of around 0.88 cosine similarity for paraphrase detection.
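Both checks can be sketched with the standard library. The vectors below are toy values; in practice they would come from whatever embedding model you choose, which this sketch does not assume. The function names and the page paths are illustrative.

```python
import math
from typing import Optional
from urllib.parse import urlparse


def in_site(url: str, site_domain: str) -> bool:
    """True if the URL's host is the site domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    site = site_domain.lower()
    return host == site or host.endswith("." + site)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(
    snippet_vec: list[float],
    page_index: dict[str, list[float]],
    threshold: float = 0.88,
) -> Optional[tuple[str, float]]:
    """Return (page, score) for the closest page, or None if below the threshold."""
    best_url, best_score = None, 0.0
    for url, vec in page_index.items():
        score = cosine(snippet_vec, vec)
        if score > best_score:
            best_url, best_score = url, score
    if best_url is None or best_score < threshold:
        return None
    return best_url, best_score


# Toy two-dimensional "embeddings" for illustration only.
index = {"/pricing": [0.9, 0.1], "/blog/post": [0.0, 1.0]}
hit = best_match([1.0, 0.0], index)
miss = best_match([1.0, 1.0], {"/blog/post": [0.0, 1.0]})
```

A linear scan is fine for a few hundred pages; a larger index would likely want a vector store, which is outside this sketch.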

  5. Schedule it

    Run every 6 hours and store the results in a time-series database. Variance is high, so don't chase single-day swings; use 7-day rolling windows.
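To make the rolling-window idea concrete, the sketch below turns individual runs into a daily citation rate and then averages the last seven days. It works on in-memory tuples rather than a real time-series database, and the function names and sample dates are illustrative.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Optional


def daily_rates(runs: list[tuple[date, bool]]) -> dict[date, float]:
    """Share of runs per day in which the site was cited."""
    hits: dict[date, int] = defaultdict(int)
    totals: dict[date, int] = defaultdict(int)
    for day, cited in runs:
        totals[day] += 1
        hits[day] += int(cited)
    return {day: hits[day] / totals[day] for day in totals}


def rolling_rate(
    rates: dict[date, float], end: date, window_days: int = 7
) -> Optional[float]:
    """Mean daily rate over the window ending on `end`; days without data are skipped."""
    days = [end - timedelta(days=i) for i in range(window_days)]
    values = [rates[d] for d in days if d in rates]
    return sum(values) / len(values) if values else None


# Two days of four runs each (one run every 6 hours).
d1, d2 = date(2026, 1, 1), date(2026, 1, 2)
runs = [(d1, True), (d1, False), (d1, False), (d1, False),
        (d2, True), (d2, True), (d2, True), (d2, False)]
rates = daily_rates(runs)
smoothed = rolling_rate(rates, end=d2)
```

The daily rate jumps from 0.25 to 0.75 between the two days, while the windowed value is 0.5, which is the kind of smoothing the step recommends.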

  6. Build the deduper and competitor tracker

    This is the hardest part. Near-duplicate paragraphs (model regenerations) need collapsing, and competitor citations need detecting: maintain a competitor URL list and run the same matcher against it. This is where most DIY implementations fall over.
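One simple way to approach both pieces is sketched below. Near-duplicates are collapsed with word-shingle Jaccard similarity, which is a swapped-in technique chosen for brevity rather than anything the steps above prescribe. The 0.7 threshold, the function names, and the `rival.io` domain are assumptions for illustration.

```python
from urllib.parse import urlparse


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Set of k-word windows over the lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def collapse_near_duplicates(paragraphs: list[str], threshold: float = 0.7) -> list[str]:
    """Keep the first paragraph of each near-duplicate group (threshold is an assumption)."""
    kept: list[tuple[str, set]] = []
    for paragraph in paragraphs:
        current = shingles(paragraph)
        if any(jaccard(current, seen) >= threshold for _, seen in kept):
            continue
        kept.append((paragraph, current))
    return [paragraph for paragraph, _ in kept]


def competitor_hits(urls: list[str], competitor_domains: list[str]) -> dict[str, int]:
    """Count cited URLs per competitor domain, using the same host match as for your site."""
    hits: dict[str, int] = {}
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        for domain in competitor_domains:
            if host == domain or host.endswith("." + domain):
                hits[domain] = hits.get(domain, 0) + 1
    return hits


a = "Acme CRM is a strong pick for small agencies that need invoicing built in"
b = a + " today"  # a regeneration that differs by one word
c = "Freelancers usually prefer a lighter tool with fewer setup steps"
unique = collapse_near_duplicates([a, b, c])
hits = competitor_hits(
    ["https://www.rival.io/x", "https://example.com/y", "https://rival.io/z"],
    ["rival.io"],
)
```

Shingle overlap catches light rewording but not heavier paraphrase; for that, the embedding comparison from the matching step is the more natural tool.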

What to expect

A DIY tracker will cost you roughly $80–150/month in API calls for a single site, assuming 50 queries × 4 platforms × 4 runs/day. Maintenance runs 2–4 hours a week as API contracts drift. Our honest take: build it if you want to understand the shape; otherwise pay us $49/month and point the engineering time at something that compounds.
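The call volume behind that estimate can be worked out directly. The sketch below assumes a 30-day month and simply divides the quoted monthly range by the number of calls; the implied per-call figure (roughly 0.3 to 0.6 cents) is derived from the numbers above, not from any vendor price list.

```python
QUERIES = 50
PLATFORMS = 4
RUNS_PER_DAY = 4        # one run every 6 hours
DAYS_PER_MONTH = 30     # assumption for the estimate

calls_per_day = QUERIES * PLATFORMS * RUNS_PER_DAY    # 800
calls_per_month = calls_per_day * DAYS_PER_MONTH      # 24000

# Monthly cost range quoted above, in dollars.
monthly_low, monthly_high = 80, 150

# Implied average cost per API call at each end of the range.
per_call_low = monthly_low / calls_per_month
per_call_high = monthly_high / calls_per_month

print(f"{calls_per_month} calls/month")
print(f"implied cost per call: ${per_call_low:.5f} to ${per_call_high:.5f}")
```

Changing the panel size or run frequency scales the call count linearly, so this is the first knob to turn if the bill is too high.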

Signals · sourced
72.4% of cited pages include ≥2 question-based H2s. Source: Cited-page pattern audit, 2026
+30–40% citation lift when GEO schema is correctly applied. Source: Aggarwal et al. · Princeton
42% of B2B buyer research now starts inside an LLM. Source: Forrester Research, 2026

Written by

The AIRank Editorial Team

Research & editorial, AIRank

The AIRank editorial team runs the 47-point scanner, the Observer pings, and the GEO research programme every week. Writing is reviewed by the core engineers who build the Injector, Blaster, and Surgeon agents.
