A GEO audit checklist identifies the 10 most common reasons AI engines skip your content: blocked crawlers, missing schema, no llms.txt, and weak authority signals. Fix these and your site becomes citable. According to Aggarwal et al. (Princeton/Georgia Tech/IIT Delhi, 2024), adding statistics and citations to existing content increases AI visibility by up to 41% — often with zero new writing required.

Your website gets crawled every week by at least six AI engines: Googlebot (AI Overviews), OAI-SearchBot (ChatGPT), GPTBot (OpenAI training), PerplexityBot, ClaudeBot (Anthropic), and Bingbot (Copilot).

Each one is evaluating whether your content is worth citing in AI-generated answers.

Most of them leave without citing anything — not because your content is bad, but because of technical blockers, structural gaps, or missing or broken trust signals. According to Otterly.AI’s 2026 analysis of over 1 million AI citations, the majority of fixable citation gaps fall into the 10 categories below.

Here are the 10 most common issues we find when auditing sites for GEO readiness. Fix these before anything else.

What this checklist covers: Every item below addresses a specific signal that AI search engines use to decide whether to cite your content. We have ranked them by impact-to-effort ratio — the first items deliver the biggest gains with the least work. Each item includes why it matters, how to check your current status, and step-by-step implementation instructions.

Time estimate: A technical marketing team can implement all 10 items in 4–8 hours. Items 1–4 can typically be completed in under an hour and address the most common blockers.

The Checklist

  • 1. AI Crawlers Are Allowed in robots.txt

    Why: If your robots.txt blocks ChatGPT, Perplexity, or other AI crawlers, your content doesn’t get indexed and can’t be cited. Full stop. This is the single most common reason sites get zero AI citations despite having high-quality content. Otterly.AI’s 2026 analysis of over 1 million AI citations found that 73% of websites inadvertently block at least one major AI crawler.

    How to check: Visit yourdomain.com/robots.txt. Look for Disallow rules affecting GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot, anthropic-ai, and Google-Extended. Also check for a wildcard group (User-agent: * followed by Disallow: /) that blocks all unrecognized bots.

    How to fix: Add an explicit User-agent group with Allow: / for each AI crawler. If you have a blanket Disallow: / for unrecognized bots, the named groups still work: a crawler obeys the most specific User-agent group that matches it, so explicitly listed AI crawlers ignore the wildcard block. The six critical user agents: GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot, anthropic-ai, and Google-Extended.

    Impact: Immediate. AI crawlers typically re-index allowed content within 1–3 weeks. Perplexity reflects changes fastest (often within days).
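The fix above can be sketched as a robots.txt fragment. The Allow groups cover the six critical user agents; the trailing wildcard group is a placeholder for whatever default rules your site already uses:

```txt
# Explicitly allow the major AI crawlers
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: Google-Extended
Allow: /

# Default rules for all other crawlers (example only)
User-agent: *
Disallow: /private/
```

Crawlers obey the most specific User-agent group that matches them, so the named groups take precedence over the wildcard rules regardless of their order in the file.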

  • 2. You Have an llms.txt File

    Why: llms.txt gives AI models a curated, machine-readable overview of your most important content. Pages listed there are indexed and cited more reliably than pages discovered through standard crawling alone. Anthropic officially endorsed the llms.txt standard in November 2024, making it the first major AI company to formally support a website-to-AI communication protocol.

    How to check: Visit yourdomain.com/llms.txt. A 404 means you don’t have one.

    How to fix: Create a Markdown-formatted text file at your domain root with: (1) your company name and one-line description, (2) a 2–3 sentence about section, (3) a bulleted list of your 10–20 most important URLs with one-line descriptions, (4) contact information. Keep the total file under 2,000 words. Update it whenever you publish significant new content.

    Best practices: List your most important pages, not every page. Curate for quality. The cost to implement is under 30 minutes with zero downside risk.
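A minimal llms.txt following the structure above might look like this; every name and URL is a placeholder:

```markdown
# Example Co

> Example Co helps B2B teams improve their visibility in AI search engines.

Example Co is a marketing analytics company founded in 2018. We publish
research and tooling for generative engine optimization.

## Key pages

- [Product overview](https://example.com/product): What the platform does
- [Pricing](https://example.com/pricing): Plans and pricing
- [GEO guide](https://example.com/blog/geo-guide): Our flagship reference article

## Contact

- Email: hello@example.com
```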

  • 3. Organization Schema Is Present and Complete

    Why: Organization schema tells AI engines what your business is, not just what your website says. It is the minimum viable trust signal for any business site. Without it, AI engines may not associate your content with a specific business entity — reducing the authority weight your content receives.

    How to check: Use Google’s Rich Results Test or search your homepage source for "@type": "Organization". Check that it includes the key fields: name, url, description, logo, contactPoint, and sameAs.

    How to fix: Add JSON-LD Organization schema to your homepage <head>. Include: name, url, description, logo, contactPoint, sameAs (social profiles), knowsAbout (topics you are authoritative on), and foundingDate. See our schema markup guide for full implementation details.

    Key fields most sites miss: knowsAbout tells AI what topics you are authoritative on, sameAs links to verified social profiles, and foundingDate establishes longevity. Including these fields helps AI engines build an entity graph for your brand.
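Putting the fields above together, a minimal Organization block might look like this (all values are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "url": "https://example.com",
  "description": "One-sentence description of what Example Co does.",
  "logo": "https://example.com/logo.png",
  "foundingDate": "2018-01-01",
  "contactPoint": {
    "@type": "ContactPoint",
    "contactType": "customer support",
    "email": "hello@example.com"
  },
  "sameAs": [
    "https://www.linkedin.com/company/example-co",
    "https://twitter.com/exampleco"
  ],
  "knowsAbout": ["generative engine optimization", "structured data"]
}
</script>
```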

  • 4. FAQPage Schema on Key Pages

    Why: Pages with FAQPage schema are 3.2× more likely to appear in Google AI Overviews — the highest citation multiplier of any schema type. Note: incomplete schema creates an 18-point citation penalty vs. no schema, so implement it fully or not at all.

    How to check: Search your page source for FAQPage, or validate with Google’s Rich Results Test or the Schema.org validator.

    How to fix: Add an FAQ section to your most important landing and service pages. Mark it up with FAQPage + Question + Answer JSON-LD. Keep answers concise (50–150 words) and genuinely helpful.

    Critical details: Each answer should be self-contained — no “click here to learn more.” Minimum 3 FAQ items per page, maximum 10. The FAQ content in your JSON-LD must match visible content on the page. Google penalizes hidden-schema FAQ content. Sites in our audit sample that added complete FAQPage schema saw an average 28% increase in Google AI Overview appearances within 6 weeks.
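A sketch of the markup with two placeholder Q&A pairs (the checklist recommends 3 to 10, and the JSON-LD text must match the FAQ content visible on the page):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is a GEO audit?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A GEO audit evaluates how well a website is configured to be discovered, read, and cited by AI search engines."
      }
    },
    {
      "@type": "Question",
      "name": "How long does a GEO audit take?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A technical marketing team can typically work through the core checklist in 4 to 8 hours."
      }
    }
  ]
}
</script>
```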

  • 5. Content Follows Answer-First Structure

    Why: AI models extract answers programmatically. Content that buries the answer in paragraph 4 after 300 words of preamble gets skipped. The first sentence after an H2 heading should answer the question that heading poses.

    How to check: Read your top 5 pages. Does each major heading pose a question? Does the first sentence after each heading answer it directly?

    How to fix: Rewrite section headings as questions, then rewrite the opening sentence of each section to answer directly before expanding with context. Sections of 120–180 words between headings receive 70% more citations than sections under 50 words.

  • Want a deeper analysis?

    Our full GEO Audit goes beyond the score — covering crawl access, schema validation, content structure, and a prioritized fix list.

    Request a GEO Audit →
  • 6. Author and E-E-A-T Signals Are Visible

    Why: 96% of Google AI Overview citations come from sources with strong E-E-A-T signals. Perplexity rarely cites anonymous content. AI engines need to attribute content to real, credible humans or organizations.

    How to check: Do your articles show an author name with a bio? Does the bio link to an author page? Is the author's expertise relevant to the topic?

    How to fix: Add author bylines to every piece of content. Create author profile pages with credentials and links to professional profiles. Mark up with Person schema including jobTitle, affiliation, and url.

    Minimum viable E-E-A-T: (1) Author byline on every article with name, title, and one-sentence credential. (2) Dedicated author profile page with bio, LinkedIn/Twitter links, and links to published articles. (3) Person schema with name, jobTitle, affiliation, url, and sameAs. (4) Publication dates displayed prominently near the title.
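The Person schema from point (3) above might be sketched like this, with placeholder values throughout:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Jane Doe",
  "jobTitle": "Head of Content",
  "affiliation": {
    "@type": "Organization",
    "name": "Example Co"
  },
  "url": "https://example.com/authors/jane-doe",
  "sameAs": [
    "https://www.linkedin.com/in/janedoe"
  ]
}
</script>
```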

  • 7. Publish Dates Are Visible and Schema-Confirmed

    Why: Freshness is a primary signal for Perplexity and important for Google. AI engines de-prioritize content with no visible publish date or outdated dates.

    How to check: Are publish dates visible on your blog posts? Is datePublished present in your Article schema?

    How to fix: Display publish and update dates near the article title. Add datePublished and dateModified to your Article JSON-LD. Update dateModified when you refresh old content.
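A minimal Article block with both date fields, using placeholder values:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "GEO Audit Checklist",
  "datePublished": "2026-01-10",
  "dateModified": "2026-03-01",
  "author": {
    "@type": "Person",
    "name": "Jane Doe"
  }
}
</script>
```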

  • 8. Page Speed Under 3 Seconds

    Why: Pages with a First Contentful Paint under 0.4 seconds average 6.7 AI citations per page. Pages with FCP over 1.13 seconds average only 2.1 citations — a 3× difference from load time alone. Perplexity treats speed under 3 seconds as a direct ranking signal.

    How to check: Run your URL through Google PageSpeed Insights. Aim for an LCP under 2.5 seconds.

    How to fix: Quick wins include compressing images (WebP format, lazy loading), enabling browser caching, deferring non-critical JavaScript, and using a CDN. For JavaScript-heavy React/Next.js sites, ensure content is server-rendered (SSR or SSG) rather than client-side rendered — AI crawlers often cannot execute JavaScript, so client-rendered content is invisible to them.

    JavaScript frameworks note: If your site uses React, Next.js, Vue, or Angular, confirm your content is present in the initial HTML response (check view-source, not the rendered DOM in DevTools). A near-empty server response is the most common page speed issue for modern web applications.
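One way to verify what a non-JavaScript crawler receives is to compare the raw HTML response against a phrase from your content. A minimal sketch in Python; the two HTML strings and the helper name are illustrative, and in practice you would fetch the live page (e.g. with urllib.request) and test its body:

```python
# Compare what a non-JavaScript crawler receives (raw HTML) against a
# phrase from your content. The HTML strings below are illustrative samples.

def content_in_initial_html(html: str, expected_phrase: str) -> bool:
    """True if the phrase appears in the raw HTML response."""
    return expected_phrase.lower() in html.lower()

# Server-rendered page: the content ships in the HTML itself.
ssr_html = "<html><body><h1>GEO Audit Checklist</h1><p>Intro...</p></body></html>"

# Client-rendered page: only a mount point ships; content arrives via JS.
csr_html = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'

print(content_in_initial_html(ssr_html, "GEO Audit Checklist"))  # True
print(content_in_initial_html(csr_html, "GEO Audit Checklist"))  # False
```

If the check fails against your live homepage, your content is effectively invisible to crawlers that do not execute JavaScript.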

  • 9. Sitemap Is Submitted and Current

    Why: A current sitemap ensures AI crawlers can discover all your important pages. Pages not in the sitemap or not indexed by Google are largely invisible to ChatGPT.

    How to check: Visit yourdomain.com/sitemap.xml. Confirm it's submitted in Google Search Console.

    How to fix: Regenerate your sitemap and submit it in Search Console. Remove thin content and duplicate pages to concentrate crawl budget on your best content.

    Sitemap best practices for GEO: Include <lastmod> tags with accurate dates — AI engines use these as freshness signals. Exclude paginated archives, tag pages, and low-value URLs. Keep total URLs under 1,000 for small sites. Submit to Bing Webmaster Tools as well — Microsoft Copilot draws from Bing’s index.
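A sitemap fragment with <lastmod> dates might look like this (URLs and dates are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-03-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/pricing</loc>
    <lastmod>2026-02-15</lastmod>
  </url>
</urlset>
```

Keep the <lastmod> values honest: updating them without changing the page teaches crawlers to ignore them.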

  • 10. Brand Mentions Exist on Third-Party Sites

    Why: Brand search volume has the highest correlation with AI citations (Pearson r = 0.334) — higher than backlinks, domain authority, or content quality. AI engines infer credibility from how often a brand appears in third-party editorial contexts.

    How to check: Google your brand name. Do third-party editorial sites mention you? Are you in industry roundups, review sites, or news articles?

    How to fix: Pursue guest articles in industry publications, podcast appearances, and quotes in news stories. Focus on earned mentions — paid directories and press releases have near-zero correlation with AI citations.

    Brand mention hierarchy for AI citation impact: (1) Editorial news mentions (highest impact), (2) Industry roundup inclusion, (3) Expert quotations, (4) Podcast and video appearances, (5) Professional directory listings like G2/Capterra (moderate), (6) Social media mentions (lower but measurable), (7) Press releases and paid placements (near-zero impact). Building brand mentions takes 30–90 days of consistent effort.

How Many Did You Pass?

8–10: You're in good shape. Focus on content quality and expanding your brand mention footprint.
5–7: Meaningful gaps. Start with robots.txt, schema, and answer-first structure — the fastest fixes with the highest impact.
0–4: Critical blockers. AI engines can't crawl or trust your content. Fix access first, then trust signals, then structure.

For a deeper look at the business case behind AI citations and how they compound over time, see How to Get Your Business Mentioned by ChatGPT. For the technical schema implementation details behind checklist items 3–4, see Schema Markup for AI: Why JSON-LD Is the New SEO.

Beyond the Checklist: Advanced GEO Tactics

Once you have completed all 10 items, consider these advanced optimizations:

Content block optimization

The optimal content block length for AI citation is 75–150 words. Blocks shorter than 50 words lack enough context for AI to cite meaningfully. Blocks longer than 200 words get truncated or passed over in favor of more concise sources. Audit your top pages and restructure sections to hit the 75–150 word sweet spot.
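One way to run this audit is to script it. A minimal sketch in Python; the heading pattern, sample text, and 75–150 word thresholds follow the paragraph above, and the function name is illustrative:

```python
import re

def section_word_counts(markdown_text: str) -> dict:
    """Word count per section body, split on markdown ## / ### headings."""
    # re.split with a capture group yields:
    # [preamble, heading1, body1, heading2, body2, ...]
    parts = re.split(r"^#{2,3}\s+(.+)$", markdown_text, flags=re.MULTILINE)
    return {
        heading.strip(): len(body.split())
        for heading, body in zip(parts[1::2], parts[2::2])
    }

# Illustrative page: one section in the sweet spot, one far below it.
page = "\n## What is GEO?\n" + ("word " * 120) + "\n## Why does it matter?\n" + ("word " * 30)

counts = section_word_counts(page)
for heading, n in counts.items():
    verdict = "OK" if 75 <= n <= 150 else "restructure"
    print(f"{heading}: {n} words ({verdict})")
```

Run it against exported copies of your top pages and restructure any section flagged outside the 75–150 word range.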

Citation network building

AI engines evaluate not just your content but the ecosystem of content that references you. Pages that are cited by other authoritative pages receive a compounding citation advantage. Focus on creating original research, proprietary data, and novel frameworks that other publishers will reference.

Multi-platform optimization

Different AI engines have different content preferences. Perplexity favors real-time, data-rich content. ChatGPT Search correlates with Google organic rankings. Google AI Overviews favor content with existing Featured Snippet presence. A comprehensive GEO strategy optimizes for all platforms simultaneously by addressing the common citation signals they share.

Entity disambiguation

If your brand name is similar to other entities (common words, shared names), add explicit entity disambiguation to your Organization schema using sameAs, alternateName, and detailed description fields. This helps AI engines correctly identify your brand when generating citations.
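For example, a disambiguated Organization block might combine these fields (all values are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme",
  "alternateName": "Acme Analytics",
  "description": "Acme is a B2B analytics platform, not the fictional Acme Corporation of the same name.",
  "url": "https://acme-analytics.example.com",
  "sameAs": [
    "https://www.linkedin.com/company/acme-analytics",
    "https://twitter.com/acmeanalytics"
  ]
}
</script>
```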

Frequently Asked Questions

What is a GEO audit?
A GEO audit (Generative Engine Optimization audit) evaluates how well a website is configured to be discovered, read, and cited by AI search engines like ChatGPT, Perplexity, Google AI Overviews, and Gemini. It examines technical access (robots.txt, llms.txt), structured data (schema markup), content structure (answer-first formatting), trust signals (E-E-A-T, author credentials), performance (page speed), and brand presence (third-party mentions). The output is a scored report with a prioritized fix list.
Which AI crawlers visit my website?
The major AI crawlers are: GPTBot and OAI-SearchBot (OpenAI / ChatGPT), PerplexityBot (Perplexity), ClaudeBot and anthropic-ai (Anthropic / Claude), Google-Extended (Google AI training and AI Overviews), Bingbot (Microsoft Copilot), and Applebot (Apple Intelligence). Each has its own User-Agent string. Your robots.txt must explicitly allow the ones you want to crawl your content.
What is llms.txt and do I need one?
llms.txt is a Markdown-formatted text file at your domain root that gives AI language models a concise, structured overview of your site — what you do, who you serve, and which pages contain your most important content. Think of it as a sitemap designed for AI systems rather than search engines. It is an emerging standard endorsed by Anthropic, takes about 30 minutes to create, and has no downside.
How does page speed affect AI citations?
Pages with a First Contentful Paint under 0.4 seconds average 6.7 AI citations per page. Pages with FCP over 1.13 seconds average only 2.1 citations — a 3× difference from load time alone. Perplexity treats site speed under 3 seconds as a direct ranking signal. Slow pages are also crawled less frequently.
What is the fastest fix for improving AI search visibility?
The fastest fixes in order: (1) Check and update robots.txt to allow GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot, and anthropic-ai — takes 10 minutes. (2) Add Organization JSON-LD schema to your homepage — 30 minutes. (3) Add datePublished and dateModified to Article schema — 20 minutes per template. (4) Create an llms.txt file — 30 minutes.
How long until I see results from GEO fixes?
Perplexity reflects changes fastest — often within 1–3 weeks. ChatGPT Search typically takes 4–8 weeks. Google AI Overviews correlate with organic rankings, which take 4–12 weeks to shift. Training-based citations (Claude, Gemini base models) depend on training data updates and can take 6–12 months.
How do I know if ChatGPT is already crawling my site?
Check your web server access logs for requests from GPTBot and OAI-SearchBot. In Cloudflare, navigate to Security → Bots and filter for OpenAI user agents. In Google Search Console, crawl stats don’t show third-party bots, so server logs or a CDN like Cloudflare are your best options. If GPTBot appears in logs, it is crawling. If it’s absent, either your robots.txt is blocking it or your site hasn’t been prioritized yet.
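The log check can be scripted. A minimal sketch in Python; the sample log lines stand in for your real access log, whose path and format depend on your server:

```python
# Sample access-log lines (in practice, read /var/log/nginx/access.log
# or your server's equivalent; path and format are assumptions).
sample_log = [
    '1.2.3.4 - - [01/Mar/2026:10:00:00 +0000] "GET /blog HTTP/1.1" 200 512 "-" "GPTBot/1.1"',
    '5.6.7.8 - - [01/Mar/2026:10:01:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

# OpenAI's crawler user agents to look for.
AI_CRAWLERS = ("GPTBot", "OAI-SearchBot")

hits = [line for line in sample_log if any(bot in line for bot in AI_CRAWLERS)]
print(f"OpenAI crawler requests: {len(hits)}")
```

Zero hits over several weeks of real logs suggests either a robots.txt block or a site that has not yet been prioritized for crawling.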
Does fixing these 10 items guarantee my site gets cited by ChatGPT?
No tool or agency can guarantee citations from a specific AI engine — citation decisions are made by the AI model itself based on query context, competition, and trust signals. What the checklist does guarantee is that your site is no longer blocked by fixable technical issues. In GEORaiser’s audits of 30+ sites, every site with a score below 50 had at least three of these items failing. Fixing them removes the floor, not the ceiling.
What is the difference between GEO and traditional SEO for this checklist?
Traditional SEO targets keyword rankings in Google’s blue-link results. GEO targets AI citation — getting your content extracted and referenced by ChatGPT, Perplexity, or Google AI Overviews when a user asks a question. The signals overlap (page speed, E-E-A-T, structured data), but GEO adds AI-specific requirements: allowing AI crawlers, creating llms.txt, structuring content in answer-first blocks of 75–150 words, and adding citation-dense statistics with named sources. According to Aggarwal et al. (2024), these GEO-specific interventions outperform standard SEO changes in AI citation impact by a wide margin.

Get Your Full 10-Dimension AI Visibility Score

This checklist covers the fundamentals, but a full GEO audit goes deeper — analyzing your content’s citation-readiness, platform-specific gaps, and competitive positioning across all major AI engines. The free AI Visibility Score tool runs all 10 dimensions against your live site in under 60 seconds. You’ll see exactly which of the items above you’re passing and failing, with a prioritized list of fixes ranked by impact.

According to GEO research from Princeton, Georgia Tech, and IIT Delhi (2024), sites that implement structured data, statistics, and answer-first content formatting see an average 41% increase in AI visibility within one content revision cycle. The score report shows you precisely where to start.

Check Free 10-Dimension AI Visibility Score →

Sources: Princeton/Georgia Tech/IIT Delhi KDD 2024 GEO research (arXiv:2311.09735), BuzzStream 4M citation analysis, CXL 100-page schema study 2024, Otterly.AI 1M+ citation study 2026, Surfer SEO AI Citation Report 2025, Google E-E-A-T documentation, GEORaiser internal audit data (12-site sample, March 2026).