Your website gets crawled every week by at least six AI engines: Googlebot (AI Overviews), OAI-SearchBot (ChatGPT), GPTBot (OpenAI training), PerplexityBot, ClaudeBot (Anthropic), and Bingbot (Copilot).
Each one is evaluating whether your content is worth citing in AI-generated answers.
Most of them leave without citing anything, not because your content is bad, but because of technical blockers, structural gaps, or trust signals that are missing or wrong.
Here are the 10 most common issues we find when auditing sites for GEO readiness. Fix these before anything else.
The Checklist
-
AI Crawlers Are Allowed in robots.txt
Why: If your robots.txt blocks ChatGPT, Perplexity, or other AI crawlers, your content doesn't get indexed and can't be cited. Full stop.
How to check: Visit yourdomain.com/robots.txt. Look for Disallow rules affecting GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot, anthropic-ai, and Google-Extended.
How to fix: Add explicit Allow: / rules for each bot you want to permit. If you have a blanket Disallow: / for unrecognized bots, whitelist these individually.
-
You Have an llms.txt File
Why: llms.txt gives AI models a curated, machine-readable overview of your most important content. Pages listed there are indexed and cited more reliably than pages discovered through standard crawling.
How to check: Visit yourdomain.com/llms.txt. A 404 means you don't have one.
How to fix: Create a markdown-formatted text file at your domain root with your company description, key pages, and services. This takes about 30 minutes and is essentially zero-risk.
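As a rough illustration of the llms.txt fix, here is a minimal Python sketch that renders such a file. The layout (an H1 title, a blockquote summary, then H2 sections with link lists) is an assumption based on the publicly proposed llms.txt convention, and `build_llms_txt`, the company name, and the example.com URLs are hypothetical placeholders.

```python
def build_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render a markdown-formatted llms.txt body.

    Assumed layout: H1 title, blockquote summary, then one H2 per
    section containing a bullet list of (title, url, note) links.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in pages:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical company and pages, for illustration only.
example = build_llms_txt(
    name="Example Co",
    summary="Example Co builds scheduling software for dental clinics.",
    sections={
        "Key pages": [
            ("Pricing", "https://example.com/pricing", "Plans and billing"),
            ("Docs", "https://example.com/docs", "Setup and API guides"),
        ],
        "Services": [
            ("Onboarding", "https://example.com/onboarding", "Guided setup"),
        ],
    },
)
print(example)
```

Save the output as llms.txt at the domain root and confirm that yourdomain.com/llms.txt no longer returns a 404.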
-
Organization Schema Is Present and Complete
Why: Organization schema tells AI engines what your business is, not just what your website says. It's the minimum viable trust signal for any business site.
How to check: Use Google's Rich Results Test or search your homepage source for "@type": "Organization".
How to fix: Add JSON-LD Organization schema to your homepage <head>. Include: name, url, description, logo, contactPoint, and sameAs (your social profiles). See our schema markup guide for full implementation details.
-
FAQPage Schema on Key Pages
Why: Pages with FAQPage schema are 3.2× more likely to appear in Google AI Overviews, the highest citation multiplier of any schema type. Note: incomplete schema carries an 18-point citation penalty compared with no schema, so implement it fully or not at all.
How to check: Look for FAQPage in your page source, or validate the page with the Schema Markup Validator (validator.schema.org).
How to fix: Add an FAQ section to your most important landing and service pages. Mark it up with FAQPage + Question + Answer JSON-LD. Keep answers concise (50–150 words) and genuinely helpful.
-
Content Follows Answer-First Structure
Why: AI models extract answers programmatically. Content that buries the answer in paragraph 4 after 300 words of preamble gets skipped. The first sentence after an H2 heading should answer the question that heading poses.
How to check: Read your top 5 pages. Does each major heading pose a question? Does the first sentence after each heading answer it directly?
How to fix: Rewrite section headings as questions. Rewrite the opening sentence of each section so that it answers the question directly before expanding with context. Sections of 120–180 words between headings receive 70% more citations than sections under 50 words.
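The mechanical part of this check can be scripted. The sketch below assumes you have already split a page into (heading, body) pairs; `audit_sections` and the sample data are hypothetical. It only reports whether each heading is phrased as a question and whether the section length falls in the 120–180 word range cited above. Whether the first sentence actually answers the question still needs a human read.

```python
def audit_sections(
    sections: list[tuple[str, str]],
    low: int = 120,
    high: int = 180,
) -> list[dict]:
    """Report heading style and word count for each (heading, body) pair."""
    report = []
    for heading, body in sections:
        words = len(body.split())  # whitespace-delimited word count
        report.append(
            {
                "heading": heading,
                "words": words,
                "is_question": heading.rstrip().endswith("?"),
                "in_range": low <= words <= high,
            }
        )
    return report


# Placeholder content: 130 and 40 filler words respectively.
pages = [
    ("What is GEO?", "word " * 130),
    ("Background", "word " * 40),
]
report = audit_sections(pages)
for row in report:
    print(row)
```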
-
Author and E-E-A-T Signals Are Visible
Why: 96% of Google AI Overview citations come from sources with strong E-E-A-T signals. Perplexity rarely cites anonymous content. AI engines need to attribute content to real, credible humans or organizations.
How to check: Do your articles show an author name with a bio? Does the bio link to an author page? Is the author's expertise relevant to the topic?
How to fix: Add author bylines to every piece of content. Create author profile pages with credentials and links to professional profiles. Mark up with Person schema including jobTitle, affiliation, and url.
-
Publish Dates Are Visible and Schema-Confirmed
Why: Freshness is a primary signal for Perplexity and important for Google. AI engines de-prioritize content with no visible publish date or outdated dates.
How to check: Are publish dates visible on your blog posts? Is datePublished present in your Article schema?
How to fix: Display publish and update dates near the article title. Add datePublished and dateModified to your Article JSON-LD. Update dateModified when you refresh old content.
-
Page Speed Under 3 Seconds
Why: Pages with a First Contentful Paint (FCP) under 0.4 seconds average 6.7 AI citations per page. Pages with FCP over 1.13 seconds average only 2.1 citations, a 3× difference from load time alone. Perplexity treats speed under 3 seconds as a direct ranking signal.
How to check: Run your URL through Google PageSpeed Insights. Aim for a Largest Contentful Paint (LCP) under 2.5 seconds.
How to fix: Quick wins: compress images, enable caching, defer non-critical JavaScript, use a CDN. For JavaScript-heavy React/Next.js sites, ensure content is rendered on the server rather than on the client.
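One rough way to test the server-rendering point is to check whether a key phrase appears in the raw HTML response, before any JavaScript runs. The sketch below is an approximation under that assumption; `phrase_in_raw_html` and both sample documents are hypothetical. It ignores text inside script and style tags, since that text is not page content until a browser executes it.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def phrase_in_raw_html(html: str, phrase: str) -> bool:
    """Return True if the phrase occurs in the markup's non-script text."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())  # normalize whitespace
    return phrase.lower() in text.lower()


# Hypothetical responses: one client-rendered shell, one server-rendered page.
client_only = (
    '<html><body><div id="root"></div>'
    '<script>render("Our pricing starts at $10")</script></body></html>'
)
server_side = (
    "<html><body><h1>Pricing</h1>"
    "<p>Our pricing starts at $10</p></body></html>"
)

print(phrase_in_raw_html(client_only, "pricing starts at"))
print(phrase_in_raw_html(server_side, "pricing starts at"))
```

If a phrase you can see in the browser is missing from the raw response, the content is probably being injected on the client.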
-
Sitemap Is Submitted and Current
Why: A current sitemap ensures AI crawlers can discover all your important pages. Pages not in the sitemap or not indexed by Google are largely invisible to ChatGPT.
How to check: Visit yourdomain.com/sitemap.xml. Confirm it's submitted in Google Search Console.
How to fix: Regenerate your sitemap and submit it in Search Console. Remove thin content and duplicate pages to concentrate crawl budget on your best content.
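To get a quick read on whether a sitemap is current, you can scan its lastmod values. This is a minimal sketch using Python's standard library; `stale_urls`, the 365-day cutoff, and the sample sitemap are assumptions for illustration, not thresholds from the research cited in this article.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def stale_urls(sitemap_xml: str, today: date, max_age_days: int = 365) -> list[str]:
    """Return URLs whose <lastmod> is missing or older than the cutoff."""
    root = ET.fromstring(sitemap_xml)
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        # lastmod may be a plain date or a full datetime; the first
        # ten characters are YYYY-MM-DD in both cases.
        if not lastmod or date.fromisoformat(lastmod[:10]) < cutoff:
            stale.append(loc)
    return stale


# Hypothetical sitemap content.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2025-01-10</lastmod></url>
  <url><loc>https://example.com/old-post</loc><lastmod>2021-03-02T08:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/no-date</loc></url>
</urlset>"""

result = stale_urls(sample, today=date(2025, 6, 1))
print(result)
```

URLs flagged here are candidates for a refresh or for removal along with other thin content.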
-
Brand Mentions Exist on Third-Party Sites
Why: Brand search volume has the highest correlation with AI citations (Pearson correlation of 0.334), higher than backlinks, domain authority, or content quality. AI engines infer credibility from how often a brand appears in third-party editorial contexts.
How to check: Google your brand name. Do third-party editorial sites mention you? Are you in industry roundups, review sites, or news articles?
How to fix: Pursue guest articles in industry publications, podcast appearances, and quotes in news stories. Focus on earned mentions: paid directories and press releases have near-zero correlation with AI citations.
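Going back to the first item on the checklist, the robots.txt check is easy to script. The sketch below uses Python's standard `urllib.robotparser` against a robots.txt body you have already downloaded; `blocked_bots` and the sample file are hypothetical. Note that this parser applies the first matching rule within a group, which may differ from how an individual crawler resolves conflicting rules, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the robots.txt item above.
AI_BOTS = [
    "GPTBot",
    "OAI-SearchBot",
    "PerplexityBot",
    "ClaudeBot",
    "anthropic-ai",
    "Google-Extended",
]


def blocked_bots(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI bots that this robots.txt body disallows for the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt that blocks only GPTBot.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /private/
"""

print(blocked_bots(sample))
```

Any bot name that appears in the output needs an explicit Allow: / group before its content can be crawled.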
How Many Did You Pass?
For a deeper look at the business case behind AI citations and how they compound over time, see How to Get Your Business Mentioned by ChatGPT. For the technical schema implementation details behind checklist items 3–4, see Schema Markup for AI: Why JSON-LD Is the New SEO.
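As a starting point for items 3 and 4, the sketch below builds Organization and FAQPage JSON-LD in Python and wraps it in a script tag for the page <head>. The property names come from schema.org; the helper functions, the contactType value, and all company details are hypothetical placeholders, and the result should still be run through a validator before publishing.

```python
import json


def organization_schema(name, url, description, logo, contact_email, same_as):
    """Build Organization JSON-LD with the fields listed in item 3."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "logo": logo,
        "contactPoint": {
            "@type": "ContactPoint",
            "contactType": "customer support",  # placeholder value
            "email": contact_email,
        },
        "sameAs": same_as,
    }


def faq_schema(pairs):
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(schema):
    """Wrap a schema dict in a JSON-LD script tag."""
    body = json.dumps(schema, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical business details.
org = organization_schema(
    name="Example Co",
    url="https://example.com",
    description="Scheduling software for dental clinics.",
    logo="https://example.com/logo.png",
    contact_email="support@example.com",
    same_as=["https://www.linkedin.com/company/example-co"],
)
faq = faq_schema([("What does Example Co do?", "It builds scheduling software.")])
tag = as_script_tag(faq)
print(as_script_tag(org))
print(tag)
```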
Get Your Full GEO Score
This checklist covers the fundamentals, but a full GEO audit goes deeper — analyzing your content's citation-readiness, platform-specific gaps, and competitive positioning across all major AI engines.
GEORaiser audits websites for AI search readiness. Statistics from Princeton/Georgia Tech (KDD 2024 GEO research), BuzzStream (4M citation analysis), and Google E-E-A-T documentation.