Your website gets crawled every week by at least six AI engines: Googlebot (AI Overviews), OAI-SearchBot (ChatGPT), GPTBot (OpenAI training), PerplexityBot, ClaudeBot (Anthropic), and Bingbot (Copilot).

Each one is evaluating whether your content is worth citing in AI-generated answers.

Most of them leave without citing anything — not because your content is bad, but because you have technical blockers, structural gaps, or trust signals that are missing or wrong.

Here are the 10 most common issues we find when auditing sites for GEO (generative engine optimization) readiness. Fix these before anything else.

The Checklist

  • 1

    AI Crawlers Are Allowed in robots.txt

    Why If your robots.txt blocks ChatGPT, Perplexity, or other AI crawlers, your content doesn't get indexed and can't be cited. Full stop.

    How to check Visit yourdomain.com/robots.txt. Look for Disallow rules affecting GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot, anthropic-ai, and Google-Extended.

    How to fix Add explicit Allow: / rules for each bot you want to permit. If you have a blanket Disallow: / for unrecognized bots, whitelist these individually.
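The manual check above can also be scripted. Here is a minimal sketch using Python's standard-library robots.txt parser. The sample robots.txt is hypothetical, and the parser's matching rules may differ slightly from how each crawler reads the file, so treat the result as a first pass rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the checklist item above.
AI_BOTS = [
    "GPTBot",
    "OAI-SearchBot",
    "PerplexityBot",
    "ClaudeBot",
    "anthropic-ai",
    "Google-Extended",
]


def check_ai_access(robots_txt: str, path: str = "/") -> dict[str, bool]:
    """Return {bot: allowed} for each AI crawler, given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in AI_BOTS}


# Hypothetical robots.txt: a blanket block with two bots explicitly allowed.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /
"""

for bot, allowed in check_ai_access(SAMPLE_ROBOTS).items():
    print(f"{bot:16} {'allowed' if allowed else 'BLOCKED'}")
```

With this sample, only GPTBot and PerplexityBot come back as allowed. Every other bot falls through to the blanket rule, which is exactly the situation the fix above addresses.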

  • 2

    You Have an llms.txt File

    Why llms.txt is a proposed convention that gives AI models a curated, machine-readable overview of your most important content. Pages listed there are indexed and cited more reliably than pages discovered through standard crawling.

    How to check Visit yourdomain.com/llms.txt. A 404 means you don't have one.

    How to fix Create a Markdown-formatted text file at your domain root with your company description, key pages, and services. It takes about 30 minutes and is essentially zero-risk.
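If it helps to see the shape of the file, here is a rough sketch that renders one. The layout (an H1 title, a blockquote summary, then H2 sections of links) is our reading of the llms.txt proposal. The company name, URLs, and descriptions are placeholders, and `build_llms_txt` is a helper invented for this example.

```python
def build_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder company and URLs, for illustration only.
llms_txt = build_llms_txt(
    name="Example Co",
    summary="Example Co builds scheduling software for dental clinics.",
    sections={
        "Key pages": [
            ("Pricing", "https://example.com/pricing", "Plans and billing FAQ"),
            ("Docs", "https://example.com/docs", "Setup and integration guides"),
        ],
        "Services": [
            ("Onboarding", "https://example.com/onboarding", "Guided data migration"),
        ],
    },
)
print(llms_txt)
```

Save the output as llms.txt at your domain root, then repeat the check above to confirm it no longer returns a 404.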

  • 3

    Organization Schema Is Present and Complete

    Why Organization schema tells AI engines what your business is, not just what your website says. It's the minimum viable trust signal for any business site.

    How to check Use Google's Rich Results Test or search your homepage source for "@type": "Organization".

    How to fix Add JSON-LD Organization schema to your homepage <head>. Include: name, url, description, logo, contactPoint, and sameAs (your social profiles). See our schema markup guide for full implementation details.
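As an illustration of the fix above, this sketch builds an Organization object with the six listed properties, checks that none is empty, and wraps it in the script tag that goes in your homepage head. Every value is a placeholder, and `REQUIRED` simply mirrors the list in this checklist rather than any official minimum.

```python
import json

# All values are placeholders; swap in your real details.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com",
    "description": "Scheduling software for dental clinics.",
    "logo": "https://example.com/logo.png",
    "contactPoint": {
        "@type": "ContactPoint",
        "contactType": "customer support",
        "email": "support@example.com",
    },
    "sameAs": [
        "https://www.linkedin.com/company/example-co",
        "https://x.com/exampleco",
    ],
}

# The properties this checklist item asks for.
REQUIRED = ("name", "url", "description", "logo", "contactPoint", "sameAs")
missing = [key for key in REQUIRED if not organization.get(key)]
if missing:
    print("Incomplete schema, missing:", ", ".join(missing))

script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(organization, indent=2)
    + "\n</script>"
)
print(script_tag)
```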

  • 4

    FAQPage Schema on Key Pages

    Why Pages with FAQPage schema are 3.2× more likely to appear in Google AI Overviews, the highest citation multiplier of any schema type. Note: incomplete schema creates an 18-point citation penalty compared with no schema at all, so implement it fully or not at all.

    How to check Look for FAQPage in your page source or validate with the Schema Markup Validator at validator.schema.org.

    How to fix Add an FAQ section to your most important landing and service pages. Mark it up with FAQPage + Question + Answer JSON-LD. Keep answers concise (50–150 words) and genuinely helpful.
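The FAQPage + Question + Answer nesting described above looks roughly like this. The sketch also flags answers that fall outside the 50–150 word guideline. The questions and answers are invented for the example, and both helper functions are our own, not part of any library.

```python
import json


def build_faq_schema(faqs: list[tuple[str, str]]) -> dict:
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }


def answers_outside_range(faqs, low=50, high=150):
    """Return the questions whose answers are shorter than low or longer than high words."""
    return [q for q, a in faqs if not low <= len(a.split()) <= high]


# Invented FAQ content, for illustration only.
faqs = [
    (
        "What is GEO?",
        "GEO, or generative engine optimization, is the practice of making "
        "your site easy for AI engines to crawl, understand, and cite. It "
        "covers crawler access, structured data, answer-first writing, and "
        "trust signals such as author bios and publish dates. The goal is to "
        "be quoted in AI-generated answers, not only ranked in a traditional "
        "list of search results.",
    ),
    ("Do you offer a free trial?", "Yes."),
]

faq_schema = build_faq_schema(faqs)
too_short_or_long = answers_outside_range(faqs)
print(json.dumps(faq_schema, indent=2))
print("Answers to rewrite:", too_short_or_long)
```

Here the one-word answer gets flagged, which is the kind of thin entry worth expanding before you publish the markup.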

  • 5

    Content Follows Answer-First Structure

    Why AI models extract answers programmatically. Content that buries the answer in paragraph 4 after 300 words of preamble gets skipped. The first sentence after an H2 heading should answer the question that heading poses.

    How to check Read your top 5 pages. Does each major heading pose a question? Does the first sentence after each heading answer it directly?

    How to fix Rewrite section headings as questions. Rewrite the opening sentence of each section so it answers the question directly before expanding with context. Sections with 120–180 words between headings receive 70% more citations than sections under 50 words.
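For a quick pass over many pages, part of this check can be automated. The sketch below assumes your content is available as Markdown with `## ` H2 headings, which may not match your CMS. It reports whether each heading is phrased as a question and whether the section lands in the 120–180 word range. It cannot judge whether the first sentence actually answers the question, so that part still needs a human read.

```python
import re


def audit_sections(markdown: str) -> list[dict]:
    """Report, per H2 section, whether the heading is a question and the body word count."""
    # A capturing group in re.split keeps the heading text in the result list.
    parts = re.split(r"^## +(.+)$", markdown, flags=re.MULTILINE)
    report = []
    # parts[0] is anything before the first H2; after that, headings and bodies alternate.
    for heading, body in zip(parts[1::2], parts[2::2]):
        words = len(body.split())
        report.append(
            {
                "heading": heading.strip(),
                "is_question": heading.strip().endswith("?"),
                "words": words,
                "in_target_range": 120 <= words <= 180,
            }
        )
    return report


# Synthetic page: one well-sized question section, one thin statement section.
sample_page = (
    "Intro paragraph.\n\n"
    "## What is GEO?\n"
    + " ".join(["word"] * 130)
    + "\n\n## Pricing\nShort section.\n"
)

for row in audit_sections(sample_page):
    print(row)
```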

  • 6

    Author and E-E-A-T Signals Are Visible

    Why 96% of Google AI Overview citations come from sources with strong E-E-A-T signals. Perplexity rarely cites anonymous content. AI engines need to attribute content to real, credible humans or organizations.

    How to check Do your articles show an author name with a bio? Does the bio link to an author page? Is the author's expertise relevant to the topic?

    How to fix Add author bylines to every piece of content. Create author profile pages with credentials and links to professional profiles. Mark up with Person schema including jobTitle, affiliation, and url.
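A Person object with the three properties named above might be generated like this. The author, employer, and URLs are made up, `person_schema` is a hypothetical helper, and adding `sameAs` for professional profiles is our suggestion rather than a requirement.

```python
import json


def person_schema(name, job_title, employer, profile_url, same_as=()):
    """Build Person JSON-LD with jobTitle, affiliation, and url."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "affiliation": {"@type": "Organization", "name": employer},
        "url": profile_url,
        "sameAs": list(same_as),
    }


# Made-up author details, for illustration only.
author = person_schema(
    name="Jane Doe",
    job_title="Head of Technical SEO",
    employer="Example Co",
    profile_url="https://example.com/authors/jane-doe",
    same_as=["https://www.linkedin.com/in/jane-doe-example"],
)
print(json.dumps(author, indent=2))
```

Point `url` at the author profile page on your own site so the byline, the bio, and the markup all lead to the same place.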

  • 7

    Publish Dates Are Visible and Schema-Confirmed

    Why Freshness is a primary signal for Perplexity and important for Google. AI engines de-prioritize content with no visible publish date or outdated dates.

    How to check Are publish dates visible on your blog posts? Is datePublished present in your Article schema?

    How to fix Display publish and update dates near the article title. Add datePublished and dateModified to your Article JSON-LD. Update dateModified when you refresh old content.
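To check the schema half of this item without reading page source by hand, a sketch like the following pulls JSON-LD out of a page and reports which date fields are missing from the Article object. It only looks at top-level objects whose @type is Article, BlogPosting, or NewsArticle, which is a simplifying assumption, and the sample HTML is invented.

```python
import json
from html.parser import HTMLParser
from typing import Optional

ARTICLE_TYPES = ("Article", "BlogPosting", "NewsArticle")


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks rather than crash the audit.


def missing_date_fields(html: str) -> Optional[list[str]]:
    """Return missing date fields for the first Article found, or None if there is none."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    for item in extractor.items:
        if isinstance(item, dict) and item.get("@type") in ARTICLE_TYPES:
            return [f for f in ("datePublished", "dateModified") if f not in item]
    return None


# Invented page: has datePublished but no dateModified.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Example post", "datePublished": "2025-01-15"}
</script>
</head><body><h1>Example post</h1></body></html>
"""

print(missing_date_fields(SAMPLE_HTML))
```

An empty list means both dates are present. `None` means no Article schema was found at all, which is a bigger gap than a missing date.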

  • 8

    Page Speed Under 3 Seconds

    Why Pages with a First Contentful Paint under 0.4 seconds average 6.7 AI citations per page. Pages with FCP over 1.13 seconds average only 2.1 citations — a 3× difference from load time alone. Perplexity treats speed under 3 seconds as a direct ranking signal.

    How to check Run your URL through Google PageSpeed Insights, which reports both First Contentful Paint (FCP) and Largest Contentful Paint (LCP). Aim for LCP under 2.5 seconds.

    How to fix Quick wins: compress images, enable caching, defer non-critical JavaScript, use a CDN. For JavaScript-heavy React/Next.js sites, ensure content is server-rendered rather than rendered client-side.
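If you want to track these numbers over time instead of checking by hand, PageSpeed Insights also has an HTTP API. The sketch below builds the request URL and reads FCP and LCP out of a response. The endpoint and field names reflect the v5 API as we understand it. The sample response is a trimmed, made-up stand-in, and the network call is left out so the example runs offline.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
LCP_TARGET_SECONDS = 2.5  # Threshold from the check above.


def psi_request_url(page_url: str, strategy: str = "mobile") -> str:
    """Build the PageSpeed Insights request URL for a page."""
    return f"{PSI_ENDPOINT}?{urlencode({'url': page_url, 'strategy': strategy})}"


def paint_metrics_seconds(psi_response: dict) -> dict[str, float]:
    """Extract FCP and LCP (in seconds) from a PageSpeed Insights response."""
    audits = psi_response["lighthouseResult"]["audits"]
    return {
        # numericValue is reported in milliseconds.
        "fcp": audits["first-contentful-paint"]["numericValue"] / 1000,
        "lcp": audits["largest-contentful-paint"]["numericValue"] / 1000,
    }


# Made-up, heavily trimmed response; a real one has many more fields.
sample_response = {
    "lighthouseResult": {
        "audits": {
            "first-contentful-paint": {"numericValue": 1130.0},
            "largest-contentful-paint": {"numericValue": 2400.0},
        }
    }
}

metrics = paint_metrics_seconds(sample_response)
lcp_ok = metrics["lcp"] < LCP_TARGET_SECONDS
print(psi_request_url("https://example.com/"))
print(metrics, "LCP within target:", lcp_ok)
```

To run it against a live page, fetch the URL from `psi_request_url` with any HTTP client, parse the JSON body, and pass it to `paint_metrics_seconds`.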

  • 9

    Sitemap Is Submitted and Current

    Why A current sitemap ensures AI crawlers can discover all your important pages. Pages not in the sitemap or not indexed by Google are largely invisible to ChatGPT.

    How to check Visit yourdomain.com/sitemap.xml. Confirm it's submitted in Google Search Console.

    How to fix Regenerate your sitemap and submit it in Search Console. Remove thin content and duplicate pages to concentrate crawl budget on your best content.
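One way to judge whether a sitemap is current is to look at its lastmod values. This sketch parses a standard sitemap and lists URLs whose lastmod is missing or older than a cutoff. The 365-day cutoff is an arbitrary assumption of ours, the sample sitemap is invented, and the date parsing assumes lastmod includes at least a full YYYY-MM-DD date.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace used by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def stale_urls(sitemap_xml: str, today: date, max_age_days: int = 365) -> list[str]:
    """Return URLs whose <lastmod> is missing or older than max_age_days."""
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is None:
            stale.append(loc)
            continue
        # lastmod may carry a time and offset; the first 10 characters are the date.
        modified = date.fromisoformat(lastmod.strip()[:10])
        if (today - modified).days > max_age_days:
            stale.append(loc)
    return stale


# Invented sitemap, for illustration only.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2025-01-10</lastmod></url>
  <url><loc>https://example.com/old-post</loc><lastmod>2021-03-02T08:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/no-date</loc></url>
</urlset>
"""

for url in stale_urls(SAMPLE_SITEMAP, today=date(2025, 6, 1)):
    print("Review:", url)
```

Pages that show up here are candidates to refresh, consolidate, or drop from the sitemap.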

  • 10

    Brand Mentions Exist on Third-Party Sites

    Why Brand search volume has the highest correlation with AI citations (0.334 Pearson) — higher than backlinks, domain authority, or content quality. AI engines infer credibility from how often a brand appears in third-party editorial contexts.

    How to check Google your brand name. Do third-party editorial sites mention you? Are you in industry roundups, review sites, or news articles?

    How to fix Publish guest articles in industry publications, appear on podcasts, and get quoted in news stories. Focus on earned mentions: paid directories and press releases have near-zero correlation with AI citations.

How Many Did You Pass?

8–10
You're in good shape. Focus on content quality and expanding brand mention footprint.
5–7
Meaningful gaps. Start with robots.txt, schema, and answer-first structure — fastest fixes, highest impact.
0–4
Critical blockers. AI engines can't crawl or trust your content. Fix access first, then trust signals, then structure.

For a deeper look at the business case behind AI citations and how they compound over time, see How to Get Your Business Mentioned by ChatGPT. For the technical schema implementation details behind checklist items 3–4, see Schema Markup for AI: Why JSON-LD Is the New SEO.

Get Your Full GEO Score

This checklist covers the fundamentals, but a full GEO audit goes deeper — analyzing your content's citation-readiness, platform-specific gaps, and competitive positioning across all major AI engines.

GEORaiser audits websites for AI search readiness. Statistics from Princeton/Georgia Tech (KDD 2024 GEO research), BuzzStream (4M citation analysis), and Google E-E-A-T documentation.