Glossary

Terms covering the AEO/GEO domain and operator-specific decisions. One definition per headword, alphabetical.

AEO — Answer Engine Optimisation The practice of making a web service findable and usable by AI answer engines (ChatGPT, Perplexity, Claude, Gemini). Where SEO targets search result rankings, AEO targets agent citations and recommendations.

AGENTS.md A project-level entry point file for coding agents and AI assistants. Describes what an agent should know about the project before taking action: conventions, constraints, and capability pointers. Analogous to CLAUDE.md in development environments. Emerging standard alongside llms.txt.

Agent Readiness Score A 0–100 score across five signal dimensions (Structured Data Coverage, Citability, Crawl Signal Clarity, Content Freshness, Entity Authority) that quantifies how well a web service is positioned to be found, read, and cited by AI agents. Scores fall into five bands: Critical (0–30), Developing (31–55), Functional (56–75), Strong (76–90), and Authority (91–100).
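
The band lookup above can be sketched in a few lines of Python. This is a minimal illustration, not the operator's actual scoring code: the function name and the sample scores are hypothetical, and the equal 20-point cap per dimension is an assumption suggested by the ≥16/20 threshold in the SHIP entry.

```python
# Upper bound of each band, as listed in the definition: (upper, name).
BANDS = [
    (30, "Critical"),
    (55, "Developing"),
    (75, "Functional"),
    (90, "Strong"),
    (100, "Authority"),
]


def readiness_band(dimension_scores):
    """Sum the per-dimension scores and return (total, band name)."""
    total = sum(dimension_scores.values())
    if not 0 <= total <= 100:
        raise ValueError(f"total score {total} is outside 0-100")
    for upper, name in BANDS:
        if total <= upper:
            return total, name


# Hypothetical audit result; each dimension is assumed to be scored out of 20.
sample = {
    "Structured Data Coverage": 12,
    "Citability": 15,
    "Crawl Signal Clarity": 17,
    "Content Freshness": 9,
    "Entity Authority": 8,
}
print(readiness_band(sample))  # (61, 'Functional')
```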

Agentic Marketing Marketing designed for AI agents as the audience rather than (or in addition to) humans. Includes: llms.txt/llms-full.txt for agent readability, MCP servers for agent actions, structured data for agent extraction, and content framing optimized for how agents synthesize rather than how humans read.

AI Overview Google's AI-generated response that appears above organic search results (formerly Search Generative Experience / SGE). Powered by a combination of Google's index and Gemini. Appearing in AI Overviews is a primary GEO target for content strategies targeting Google.

Chunking How RAG systems break content into retrievable pieces. Most systems chunk by paragraph, heading section, or token count. A 5,000-word page may produce 20+ chunks — only the relevant chunks are retrieved per query. Implication: key facts buried in long prose may never be retrieved if their chunk doesn't match the query.
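
A rough sketch of paragraph-then-size chunking, to make the implication concrete. It assumes word count as a stand-in for token count and a hypothetical limit of 100 words per chunk; production RAG systems use their own tokenisers and limits.

```python
def chunk_text(text, max_words=100):
    """Split on blank lines, then cut long paragraphs into word-count windows."""
    chunks = []
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        # An empty paragraph yields no windows, so it is skipped.
        for start in range(0, len(words), max_words):
            chunks.append(" ".join(words[start:start + max_words]))
    return chunks


# Made-up page: one short factual paragraph followed by 250 words of filler.
page = "Pricing starts at 10 USD per month.\n\n" + " ".join(["filler"] * 250)
chunks = chunk_text(page)
print(len(chunks))  # 4
print(chunks[0])    # Pricing starts at 10 USD per month.
```

Here the pricing fact lands in its own chunk and can be retrieved on its own. Had it been buried in the middle of the filler paragraph, it would share a chunk with 99 unrelated words.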

Citability The quality of content that allows an agent to extract and quote a specific, verifiable claim. The mechanism is retrieval: most AI answer systems use RAG (Retrieval-Augmented Generation), which converts page content into vector embeddings and retrieves the passages most semantically similar to a user's query. Short, factually dense sentences retrieve cleanly; marketing prose and vague paragraphs do not. Citable content contains statements agents can lift verbatim: prices, specifications, statistics, named processes, definitions. Marketing copy ("the fastest, easiest solution") is not citable. A factual sentence ("the service processes invoices in under 30 seconds") is citable. GEO optimisation of the Citability dimension can improve source visibility by up to 40% in RAG-backed systems.

Citation When an AI system includes a reference to a specific source in its response — either inline attribution or a source link. Citation is the primary measurable output of AEO/GEO work.

Citation Halo Effect The carry-over citation benefit from AI Overviews. Studies indicate that 35% of users who see a source cited in an AI Overview visit that source even without clicking its citation link — and 91% of AI Overview citations also appear in the top organic results. A brand that earns consistent AI citation therefore gains organic visibility as a secondary effect.

Consensus Signals The pattern AI systems read when multiple independent sources agree about an entity or claim. Generative engines weight corroboration: a claim that appears consistently across news outlets, industry publications, community forums, and directories is treated as established; a claim that exists only on the brand's own domain is not. Consensus is built through distribution — digital PR, community presence, directory consistency — and is the structural reason Entity Authority cannot be built on-page alone. Most consensus-building actions are ESCALATE territory for the operator.

Content Freshness The degree to which agents can determine that a service's content is current. Freshness signals include visible publication and update dates, accurate HTTP last-modified headers, and recent content changes. Agents discount undated or stale content when answering time-sensitive queries.
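
One way an audit might read the Last-Modified signal, sketched with the Python standard library. The function name is hypothetical, and what age counts as stale is left open because the glossary does not define a threshold.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def content_age_days(last_modified_header, now=None):
    """Return whole days elapsed since an HTTP Last-Modified timestamp."""
    modified = parsedate_to_datetime(last_modified_header)
    now = now or datetime.now(timezone.utc)
    return (now - modified).days


# Fixed "now" so the example is reproducible.
reference = datetime(2015, 10, 31, 7, 28, tzinfo=timezone.utc)
print(content_age_days("Wed, 21 Oct 2015 07:28:00 GMT", reference))  # 10
```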

Crawl Signal Clarity The technical quality of signals that tell agents how to access, interpret, and prioritise a service's pages. Includes robots.txt configuration, XML sitemap currency, page speed, canonical tags, and Core Web Vitals. Poor crawl signal clarity means agents encounter friction or ambiguity when indexing the service — even if the content itself is excellent.

deploy manifest The operator's git-free record of what a fix pass produced. For each artifact generated: what the artifact is, where it goes, how to verify it was applied. For in-place edits: a backup of the original alongside the replacement. Enables rollback and verification without version control.
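
A possible shape for one manifest entry, sketched as a Python dataclass. The field names, file paths, and JSON rendering are illustrative assumptions; the definition above only fixes what information each entry must carry.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ManifestEntry:
    artifact: str                          # what the artifact is
    destination: str                       # where it goes
    verification: str                      # how to verify it was applied
    original_backup: Optional[str] = None  # in-place edits only: path to the saved original


def render_manifest(entries):
    """Serialise the entries so the record can be stored without version control."""
    return json.dumps([asdict(entry) for entry in entries], indent=2)


# Hypothetical fix pass: one new file and one in-place edit.
entries = [
    ManifestEntry("llms.txt", "/llms.txt", "GET /llms.txt returns HTTP 200"),
    ManifestEntry(
        "Rewritten pricing copy",
        "/pricing",
        "Page shows the new first paragraph",
        original_backup="backups/pricing.original.html",
    ),
]
print(render_manifest(entries))
```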

E-E-A-T Experience, Expertise, Authoritativeness, Trustworthiness. The quality rubric from Google's Search Quality Rater Guidelines, widely treated as the working model for how AI answer systems select trustworthy sources. Not a direct, measurable ranking factor — a rubric that shapes ranking systems through human quality evaluation. Trustworthiness is the anchor component; the other three support it.

Entity Any real-world thing — a business, person, product, place, concept — that can be uniquely identified and described. Agents reason about entities, not documents. A service that is a well-defined entity (with a consistent name, description, and set of properties agents can resolve) earns higher citation confidence than one that exists only as a domain name.

Entity Authority The degree to which an agent can resolve a service to a verified, cross-referenced real-world entity. High entity authority means the brand appears consistently across Wikipedia, Wikidata, authoritative directories, and press — and that these sources agree on who the entity is and what it does.

ESCALATE Operator decision: evidence is unverifiable, or the remedy requires action in the world (a Google Business Profile claim, Reddit participation, digital PR, Wikidata/Wikipedia creation, NAP edits on external platforms, Core Web Vitals dev work). Flagged with the specific blocker and what unblocks it. Never fabricated.

Factual Density The ratio of specific, verifiable facts to total content volume. High factual density = more citable content per chunk. AI systems prefer dense, specific content for citation over prose that describes rather than states.

Feed Engineering The practice of structuring product data for AI interpretation rather than browser display. Key distinction from traditional feed management: titles must match natural language queries ("Waterproof Wireless Running Headphones") rather than optimise for keyword ranking; descriptions must be attribute-dense and factual ("IPX5, 8-hour battery, 28g, Bluetooth 5.2") rather than benefit-led. Directly affects AI shopping agent retrievability. Required for ChatGPT Shopping eligibility alongside merchant registration at chatgpt.com/merchants.

FIX Operator decision: a confirmed, verifiable gap whose remedy is a text artifact the operator generates itself (JSON-LD, llms.txt, robots.txt, rewritten copy, <time> markup, sameAs additions). The operator writes the artifact — it does not hand off a spec.

GEO — Generative Engine Optimisation The practice of making a web service's content legible, citable, and authoritative to generative AI systems. GEO focuses on how agents read and interpret content — structured data, citability, freshness signals — rather than how users find it. GEO and AEO are complementary: GEO improves what agents see; AEO improves how often they see it. Note: as of early 2026, conversational AI platforms are integrating paid advertising; organic GEO and paid placement are becoming distinct channels within the same interface.

JSON-LD JavaScript Object Notation for Linked Data. The recommended format for embedding Schema.org structured data in web pages. Appears as a <script type="application/ld+json"> block in the page's HTML. Preferred over Microdata and RDFa because it does not require modifying existing HTML elements. Note: AI tools that convert pages to markdown strip <script> tags — JSON-LD is invisible to markdown-based audits (see: audit tool gap).
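
A short sketch of generating such a block with Python's standard json module. The organisation details and URLs are placeholders, and the helper name is hypothetical.

```python
import json


def jsonld_script(data):
    """Wrap a Schema.org dictionary in a JSON-LD script element."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so a string value cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so parsers read the same value.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder entity; every value here is made up for illustration.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com",
    "sameAs": ["https://example.org/directory/example-co"],
}
print(jsonld_script(organization))
```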

Knowledge Graph A structured database of entities and their relationships, maintained by search engines and AI systems. Google's Knowledge Graph powers Knowledge Panels. AI agents use similar internal graphs to resolve entity references. A service with a Knowledge Graph entry is a named entity agents can cite with confidence; a service without one is a URL agents must interpret from scratch.

llms-full.txt Extended version of llms.txt — contains the full site content in a flat, agent-readable format rather than linking to pages. Allows agents to read the entire site without crawling individual URLs. High-value for knowledge-dense sites (documentation, research, FAQs).

llms.txt A plain-text file at yourdomain.com/llms.txt designed for AI agent consumption. Contains structured information about the site — what it is, who it serves, key pages, and what agents are permitted to use. The agent-native equivalent of robots.txt. Platform support as of mid-2026: Perplexity and Anthropic's Claude.ai recognize it; ChatGPT does not yet use it for retrieval prioritization. Not yet universally adopted but increasingly recognized.
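
A minimal generator for such a file, assuming the Markdown layout used by the public llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of annotated links). The site name, URLs, and function name are placeholders.

```python
def build_llms_txt(title, summary, sections):
    """Render an llms.txt body from a title, a summary, and {heading: [(name, url, note)]}."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder site description.
text = build_llms_txt(
    "Example Co",
    "Invoice processing API.",
    {"Docs": [("Pricing", "https://example.com/pricing", "Plans and prices")]},
)
print(text)
```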

LLMO / AIO Alternative terms for GEO in use across the industry. LLMO (Large Language Model Optimisation) emphasises influence over a model's learned knowledge rather than retrieval-time visibility. AIO (Artificial Intelligence Optimisation) is a broader umbrella. Neither has an agreed definition as of early 2026. When a client uses these terms, treat them as synonyms for GEO until context indicates otherwise.

MCP — Model Context Protocol Anthropic's open standard for connecting AI models to external tools, data sources, and services. Relevant to agentic marketing: brands that expose MCP servers allow AI agents to take actions (book a table, check inventory, retrieve personalized data) rather than just retrieve information.

NAP Consistency Name, Address, Phone. The three core identifying attributes of a local or real-world business entity. Consistent NAP across all platforms (Google Business Profile, Yelp, directories, the service's own site) is a foundational entity authority signal. Inconsistencies confuse agents attempting to resolve and verify a business identity.

OAI-SearchBot ChatGPT's dedicated web crawler, used to index pages for ChatGPT's retrieval layer. Distinct from GPTBot (OpenAI's general training crawler). robots.txt rules apply per user agent: a rule group that allows GPTBot says nothing about OAI-SearchBot, so on a site whose default rule disallows crawlers, each must be permitted explicitly. A site that permits GPTBot but blocks OAI-SearchBot will not appear in ChatGPT search results.
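
The per-crawler behaviour can be checked with Python's standard urllib.robotparser. The robots.txt content and URL below are a made-up example of the failure case: GPTBot is allowed, but every other crawler falls through to a blanket disallow.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: GPTBot is allowed, everything else is blocked.
ROBOTS = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""


def crawler_access(robots_txt, url, agents=("GPTBot", "OAI-SearchBot", "PerplexityBot")):
    """Report, per user agent, whether robots.txt lets it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


print(crawler_access(ROBOTS, "https://example.com/pricing"))
# {'GPTBot': True, 'OAI-SearchBot': False, 'PerplexityBot': False}
```

Adding a separate `User-agent: OAI-SearchBot` group with `Allow: /` would flip its result to True without changing the default for other crawlers.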

Organic GEO vs Paid AI Placement As of 2026, major AI platforms are integrating paid advertising alongside organic citations. Organic GEO = winning citations through content quality, entity authority, and technical signals. Paid AI placement = appearing in AI responses through advertising spend. These are distinct channels with different strategies. When a competitor appears in AI answers despite weak signals, paid placement is a plausible explanation.

PerplexityBot Perplexity's crawler. Aggressive crawl frequency relative to other AI crawlers. Content indexed by PerplexityBot is available for real-time citation in Perplexity responses. Blocking it removes content from Perplexity's citation pool.

RAG — Retrieval-Augmented Generation The architecture used by most AI answer systems (ChatGPT, Perplexity, Gemini, and others) to generate grounded responses. When a user submits a query, the system converts it to a vector embedding, retrieves the most semantically similar passages from an indexed knowledge base, and feeds those passages to the LLM as context before generating the response. A web service that is indexed, structured, and factually dense gets retrieved more reliably than one with thin or vague content. This is the technical reason Citability and Structured Data Coverage directly affect whether a service appears in AI-generated answers.

sameAs A Schema.org property that links a web entity to its representation on external authoritative sources (Wikipedia, Wikidata, Google Business Profile, etc.). Including sameAs in structured data tells agents: "this entity and that entity are the same thing." It is the primary mechanism for building entity authority programmatically.

Schema Markup Structured data vocabulary (schema.org) embedded in page HTML that explicitly labels content for machines. Types relevant to AEO: Organization, LocalBusiness, FAQPage, HowTo, Product, Review, Article, Speakable. Correct schema type selection matters — FAQPage markup on content that isn't actually FAQ format produces no signal lift.

Schema.org The shared vocabulary for structured data on the web, maintained by Google, Microsoft, Yahoo, and Yandex. Schema.org defines entity types (SoftwareApplication, LocalBusiness, Product, Person, FAQPage, etc.) and their properties. When a page uses Schema.org correctly, agents can read it the way a human reads a structured form rather than a paragraph.

SEO — Search Engine Optimisation The practice of optimising content and sites to rank in traditional search engine results (Google, Bing) and earn clicks from result listings. Adjacent to but distinct from the operator's domain: SEO targets rank position and click-through; AEO/GEO target citation and inclusion in AI-generated answers. Technical SEO fundamentals (crawl access, canonical tags, page speed) are the shared floor — the operator audits them through the agent-access lens as Crawl Signal Clarity, never through the ranking lens. Boundary reference: seo-aeo-geo-distinctions.md.

SHIP Operator decision: a dimension scores high enough (≥16/20, or overall band ≥ Strong), so no action is taken and no padding is added. The operator does not generate fixes for dimensions that meet the threshold. A dimension below 16 also SHIPs when no confirmed gap can be closed with a producible artifact this pass, because the remaining lift is either already escalated or requires no artifact. In that case the route is stated as “SHIP (no producible gap)” with the reason named.
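
The routing rule can be sketched as a small function. This is one reading of the definition, not the operator's implementation: it treats the two threshold conditions as either/or, takes 76 as the lower bound of the Strong band from the Agent Readiness Score entry, and assumes escalated items are tracked separately. All names are hypothetical.

```python
SHIP_DIMENSION_MIN = 16  # out of 20, per the definition
STRONG_BAND_MIN = 76     # lower bound of the Strong band (76-90)


def route_dimension(score, overall, producible_gaps):
    """Return the route for one dimension.

    producible_gaps: confirmed gaps that a text artifact can close this pass.
    """
    if score >= SHIP_DIMENSION_MIN or overall >= STRONG_BAND_MIN:
        return "SHIP"
    if producible_gaps:
        return "FIX"
    # Below threshold, but nothing the operator can produce would close a gap.
    return "SHIP (no producible gap)"


print(route_dimension(17, 60, ["missing JSON-LD"]))  # SHIP
print(route_dimension(12, 60, ["missing JSON-LD"]))  # FIX
print(route_dimension(12, 60, []))                   # SHIP (no producible gap)
```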

Signal Map A plain-language description of how an AI agent currently sees a web service, structured by the five Agent Readiness dimensions. The Signal Map is the diagnostic layer — it explains the score, not just the number.

skill.md A declarative file placed at the root of a web service or documentation site that maps agent-callable capabilities: what the service can do, what inputs it accepts, what constraints apply, and which resources to consult. Companion to llms.txt. Relevant for services that offer API access or structured workflows agents can invoke directly, not just read.

SOM — Share of Model A measurement metric that quantifies a brand's presence within AI-generated responses as a proportion of total brand mentions in its category. Analogous to Share of Voice in traditional media. Measured by running a set of representative queries against one or more LLMs, recording which brands appear and their position, and tallying mentions over time. SOM is the ongoing measurement layer above the Agent Readiness Score: the score diagnoses structural readiness at a point in time; SOM tracks whether optimisation is working in production.
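
A simplified tally along those lines, in Python. The brand names and responses are invented, each brand counts at most once per response, and position tracking is left out. Plain substring matching is an assumption that a real measurement would likely replace with entity-aware matching.

```python
from collections import Counter


def share_of_model(responses, brands):
    """Return each brand's share of all brand mentions across the responses."""
    mentions = Counter()
    for text in responses:
        lowered = text.lower()
        for brand in brands:
            if brand.lower() in lowered:
                mentions[brand] += 1  # one mention per response per brand
    total = sum(mentions.values())
    return {brand: (mentions[brand] / total if total else 0.0) for brand in brands}


# Invented LLM answers to three representative queries.
responses = [
    "Acme and Globex both offer this.",
    "Most reviewers recommend Acme.",
    "Initech is a budget option.",
]
print(share_of_model(responses, ["Acme", "Globex", "Initech"]))
# {'Acme': 0.5, 'Globex': 0.25, 'Initech': 0.25}
```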

Source Diversity Threshold The empirically observed threshold (~250 external sources) at which AI models begin treating a brand as a known entity — one they cite with high confidence rather than hedging or omitting. Below this threshold, AI citations are opportunistic (the brand's own content was retrieved) rather than authoritative (the model has internalised the brand as a real entity). Source diversity is built through consistent brand descriptions across Tier 1 sources (Wikipedia, Wikidata, Crunchbase), community platforms (Reddit: 21–40% of AI Overview citations), trade press, and directories.

Structured Data Machine-readable markup embedded in a web page that tells agents what entities the page contains and how they relate. The dominant format is JSON-LD using Schema.org vocabulary. Without structured data, agents parse raw text and infer entity types — a process that introduces errors and reduces citation confidence.

Topic Cluster (Hub-and-Spoke) A content architecture: one comprehensive pillar page (hub) covering a topic broadly, linked to and from narrower cluster pages (spokes) that each resolve a specific intent. The dense internal link graph demonstrates topical depth that both search engines and generative systems read as authority. Relevant to the operator as context: a single audited page's citability partly depends on the cluster around it — a structural factor the operator can name in the Signal Map but cannot fix in a one-page pass.

Universal Commerce Protocol (UCP) An open standard, launched January 2026 (founding partners: Shopify, Target, Walmart), enabling AI agents to discover products, construct carts, and complete purchases across participating platforms. Operates across four layers: Discovery, Cart, Checkout, and Post-purchase. Implementation pathways: API integration, Agent-to-Agent (A2A), or Model Context Protocol (MCP). Relevant to AEO for e-commerce services — product data that is not structured for natural language queries is invisible to AI shopping agents regardless of UCP adoption status.

Vector Similarity How RAG systems find relevant content. Content is converted into numerical embeddings; when a query arrives, its embedding is compared to stored content embeddings. Closest matches are retrieved. Implication: content that uses the same language as likely queries retrieves better than content that paraphrases.
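
The comparison step can be illustrated with cosine similarity over toy embeddings. Word counts stand in for the learned dense vectors real systems use, so this sketch only rewards exact word overlap. The function names are hypothetical, and the passages reuse the examples from the Citability entry.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. Real systems use learned dense vectors."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages whose embeddings are closest to the query's."""
    query_vec = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(query_vec, embed(p)), reverse=True)
    return ranked[:k]


passages = [
    "the service processes invoices in under 30 seconds",
    "our platform is the fastest, easiest solution",
]
print(retrieve("how fast does the service process invoices", passages))
# ['the service processes invoices in under 30 seconds']
```

The factual sentence shares three words with the query while the marketing sentence shares one, so the factual sentence is retrieved.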

Zero-Click Search A search that ends without a click to any website — the answer is consumed directly from the results surface (featured snippet, AI Overview, AI chat response). Practitioner estimates put zero-click at 59–65% of searches as of 2025–2026 (T3–T4, directional; sources conflict — see seo-aeo-geo-distinctions.md). Zero-click is the economic backdrop of AEO/GEO: when the click disappears, being the cited source inside the answer is the remaining visibility.