Guide
Best Open-Source GEO/AEO Tools 2026
Eight open-source building blocks of a GEO/AEO audit stack, honestly ranked by license, maintenance activity, and fit with AI search visibility workflows. foglift-scan is the only GEO/AEO-native CLI on npm; the other seven are the adjacent tools every real audit reuses anyway.
The GEO/AEO category is two years old and the vendor landscape is dominated by closed dashboards. That is the normal shape of a young category: paid API access to AI engines pushes every serious competitor toward a hosted-subscription business model, and the fastest way to ship a product is to hide the scoring logic behind a login. The open-source tools that do exist are not marketed as "GEO" or "AEO" tools at all. They are general-purpose web infrastructure — performance, accessibility, structured data, security — that happen to produce the exact signals an AI search audit needs.
This guide ranks eight open-source tools a real Answer Engine Optimization audit uses in 2026. The #1 slot goes to the only open-source CLI published on npm that is purpose-built for GEO/AEO scoring — foglift-scan, which we publish under the MIT license. The remaining seven are the adjacent building blocks that have been OSS for years, are still actively maintained as of April 2026, and produce the signals every hosted GEO/AEO platform is scoring against anyway.
The case for an open-source stack is not ideology, it is auditability. A 2025 SE Ranking analysis of 129,000 domains found ChatGPT cites only 15% of the pages it retrieves, and the top 10 domains for any topic capture 46% of all citations. When the bar for being cited is that high, your AEO scoring logic needs to be something you can read, fork, and push back on — not a black-box score that changes when a vendor silently adjusts its weights.
Why open source matters in GEO/AEO specifically
- The scoring is contested. Aggarwal et al.'s "GEO: Generative Engine Optimization" paper (KDD 2024) introduced GEO-Bench as the first benchmark for AI-search optimization — 10,000 queries across nine categories — and showed source-level optimization lifts generative-engine citation visibility by up to 40%. A closed-source scorer has no way to show which of those levers it actually pulls; an open-source scanner publishes its heuristics in the source tree.
- Agents need to explain their reasoning. The 2025 Stack Overflow Developer Survey (n > 49,000) found 76% of developers are using or planning to use AI tools in their workflow. When an agent is calling a scanner inside an editor, the developer wants to see the rule that fired, not just the score that changed — that only works if the source is visible.
- CI gating is a compliance conversation. A 2024 Gartner projection estimates traditional search will drop 25% by the end of 2026 under AI-chatbot competition. As AEO scores move into release-gate conversations alongside Lighthouse scores and bundle-size budgets, security and legal review need to read the audit logic. Closed-source tools turn every gate debate into a vendor support ticket.
- Composability beats dashboards. A real audit pipeline chains Lighthouse, axe-core, MDN HTTP Observatory, Lychee, and foglift-scan into a single CI step. Closed dashboards do not compose; open-source CLIs do.
How we evaluated
Each tool was scored on four criteria that matter for a production GEO/AEO audit stack. Findings below were verified against each project's public GitHub repository in April 2026.
- License — permissive (MIT, Apache-2.0, MPL-2.0) or copyleft (LGPL, GPL)? Permissive licenses are friendlier for downstream commercial integration.
- Active maintenance — has a release shipped in the last six months? A stale GEO/AEO tool is a liability because the crawler-access surface keeps evolving.
- GEO/AEO signal coverage — which of the eight AEO dimensions (Structured Data Richness, Heading Clarity, FAQ Quality, Entity Identity, Content Depth, Citation Formatting, Topical Authority, AI Crawler Access) does the tool produce data for?
- CI/agent fit — does the tool expose a CLI or programmatic API that an agent or CI pipeline can call without a browser in the loop?
Quick verdict
- Best overall open-source GEO/AEO tool: foglift-scan — the only MIT-licensed CLI on npm that is purpose-built for AI search readiness scoring, with an eight-dimension AEO scan and parity with the hosted foglift.io engine.
- Best foundation SEO/perf audit: Google Lighthouse — Apache-2.0, v13.1.0 as of April 2026, the baseline every AEO scan builds on.
- Best structured-data authoring: schema-dts — Apache-2.0 TypeScript types for Schema.org; compile-time validation of JSON-LD beats runtime validators.
- Best for accessibility (which feeds AEO): axe-core — MPL-2.0, actively released, integrated into almost every headless accessibility tool.
- Best security-headers scanner: MDN HTTP Observatory — MPL-2.0 Node.js replacement for the archived Mozilla Observatory, callable via npx for CI.
- Best for broken-link hygiene: Lychee — dual-licensed Apache-2.0 or MIT; Rust-native, fast, CI-friendly.
1. foglift-scan — Editor's Pick
foglift-scan is the CLI underneath Foglift. It is published on npm under the MIT license, and the same engine powers the hosted scan at foglift.io and the first-party MCP server used by Cursor, Claude Code, and other MCP clients. Running it locally produces the same eight-dimension AEO score the hosted product returns, against any public URL, with no account required.
This is the only open-source CLI in the GEO/AEO category with that property. Every other hosted GEO/AEO vendor at the time of writing keeps its scoring logic behind a dashboard. Our motivation for publishing the scanner open-source is the audit argument: Aggarwal et al.'s GEO-Bench paper made optimization a technical field with reproducible benchmarks, and a closed scorer cannot be reproduced.
What it scans
- AEO score across 8 dimensions — Structured Data Richness, Heading Clarity, FAQ Quality, Entity Identity, Content Depth, Citation Formatting, Topical Authority, AI Crawler Access
- SEO fundamentals — titles, meta descriptions, canonicals, headings, internal linking
- Security posture — HTTPS, HSTS, CSP, and other headers that AI crawlers use as a trust signal
- AI crawler access — robots.txt policy for GPTBot, ClaudeBot, PerplexityBot, and Google-Extended
- JSON output — --json flag for CI pipelines and agent workflows; the MCP server reuses the same JSON contract
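The AI crawler access check above can be spot-checked by hand before running any tooling. The sketch below greps a robots.txt for the four crawler tokens listed; it is a grep-level presence check, not a full robots.txt group parser, so a hit means "a policy exists for this bot", not "this bot is blocked".

```shell
#!/usr/bin/env bash
# Rough spot-check of the AI Crawler Access signal: which of the four AI
# crawler user-agents does a robots.txt mention at all? Not a full parser;
# interpret hits as "policy present", then read the actual rules.
set -euo pipefail

check_crawlers() {
  # reads robots.txt text on stdin
  local txt
  txt=$(cat)
  for bot in GPTBot ClaudeBot PerplexityBot Google-Extended; do
    if printf '%s\n' "$txt" | grep -qi "user-agent:[[:space:]]*$bot"; then
      printf '%s: mentioned\n' "$bot"
    else
      printf '%s: not mentioned\n' "$bot"
    fi
  done
}

# Usage: curl -s https://example.com/robots.txt | check_crawlers
```

Because robots.txt rules apply per user-agent group, a mention is only the starting point; the Disallow lines under each group decide whether the crawler actually gets in.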
Pricing & availability
- CLI: MIT-licensed, installable via npm i -g foglift-scan. Basic scans require no API key.
- Free tier on foglift.io: unlimited website audits, AI action plan, PDF export, 200 monitoring tokens/month, 1 brand.
- Launch ($49/mo): daily monitoring across five AI engines, 4,000 tokens/mo, 3 brands.
- Growth ($129/mo): twice-daily monitoring, 11,500 tokens/mo, 10 brands.
- Enterprise ($299/mo): hourly monitoring, 27,000 tokens/mo, unlimited brands.
Pros
- + Only MIT-licensed CLI purpose-built for GEO/AEO scoring
- + Same engine as the hosted product — parity with foglift.io
- + Eight-dimension AEO breakdown, JSON output for CI
- + No API key required for basic URL scans
Cons
- - Citation tracking requires a Foglift API key (paid query costs apply)
- - Younger project than the decade-old tools below
Best for: agencies auditing client sites offline; developers who want an in-editor AEO scanner; CI pipelines that need a fail-the-build gate on AEO regressions; anyone who wants to see and fork the exact scoring heuristics a hosted GEO/AEO product is running.
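The fail-the-build gate mentioned above can be sketched in a few lines of shell. Note the .aeoScore field name in the usage comment is a hypothetical stand-in; inspect real foglift-scan --json output before wiring this into CI.

```shell
#!/usr/bin/env bash
# Fail the build when the AEO score drops below a threshold.
# NOTE: the .aeoScore jq path in the usage comment below is a hypothetical
# field name, not a documented contract; check real --json output first.
set -euo pipefail

THRESHOLD="${AEO_THRESHOLD:-70}"

gate() {
  local score="$1"
  if [ "$score" -lt "$THRESHOLD" ]; then
    echo "FAIL: AEO score $score < $THRESHOLD" >&2
    return 1
  fi
  echo "PASS: AEO score $score >= $THRESHOLD"
}

# In CI, feed the scanner's JSON through jq (field name assumed):
#   gate "$(npx -y foglift-scan "$URL" --json | jq -r '.aeoScore')"
gate "${1:-85}"
```

The non-zero exit status is the whole integration surface: any CI system treats it as a failed step, the same way a Lighthouse budget or a test suite failure is treated.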
2. Google Lighthouse
Google Lighthouse is the default entry point for any web audit. It is Apache-2.0 licensed, shipped as both a Chrome DevTools panel and a standalone Node CLI, and as of v13.1.0 (April 2026) produces scores across Performance, Accessibility, Best Practices, and SEO. The SEO category predates the AI search era, which means it audits titles, meta descriptions, canonical tags, crawlability, tap targets, and a handful of basic structured-data checks. None of that is AEO-specific, but all of it feeds AEO-adjacent signals.
Running Lighthouse alongside foglift-scan is the standard pattern. Lighthouse covers the performance and accessibility signals Google itself has spent a decade codifying; foglift-scan adds the AI-search-specific dimensions on top. The two do not overlap, so running both is additive, not redundant.
Key specs
- License: Apache-2.0
- Latest release: v13.1.0 (April 2026)
- GitHub stars: ~30k
- GEO/AEO signals surfaced: performance (Core Web Vitals), SEO fundamentals, accessibility, best practices
- CLI: npm i -g lighthouse; JSON output via the --output=json flag
Best for: every audit stack. Lighthouse is the baseline. Run it first, then layer AEO-specific tools on top.
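Lighthouse's JSON report is what makes it composable: in recent versions, each category score is stored as a 0-to-1 float under categories.&lt;id&gt;.score. A minimal extraction helper, using python3 in place of jq so the audit script gains no extra dependency:

```shell
#!/usr/bin/env bash
# Pull 0-100 category scores out of a saved Lighthouse JSON report.
# Category ids in recent Lighthouse releases: performance, accessibility,
# best-practices, seo. Scores are stored as 0-1 floats.
set -euo pipefail

report="${1:-lighthouse.json}"

lh_score() {
  python3 - "$report" "$1" <<'PY'
import json, sys
with open(sys.argv[1]) as f:
    data = json.load(f)
print(round(data["categories"][sys.argv[2]]["score"] * 100))
PY
}

# Usage, after: npx -y lighthouse "$URL" --output=json --quiet > lighthouse.json
#   lh_score seo
```

Pairing this with a threshold comparison gives you a Lighthouse budget gate in the same shape as any other CI check.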
3. MDN HTTP Observatory
The MDN HTTP Observatory (github.com/mdn/mdn-http-observatory) is the Node.js replacement for the original Mozilla Observatory, which was archived in November 2024. It scores a site's HTTP security-header configuration on an A+ through F scale and exposes the same grade publicly at developer.mozilla.org/en-US/observatory.
Security headers are not a marketing story, but they are an AI crawler trust signal. Many GEO/AEO scoring systems — including Foglift's AI Crawler Access dimension — penalize sites that serve insecure headers, misconfigured CSP, or missing HSTS. Running the MDN Observatory CLI in CI is the cheapest way to make sure your security posture is not silently dragging your AEO score down.
Key specs
- License: MPL-2.0
- CLI: runnable via npx mdn-http-observatory
- GEO/AEO signals surfaced: security headers (HSTS, CSP, X-Frame-Options, Referrer-Policy, Subresource Integrity)
- Successor to the archived Python-based Mozilla Observatory
Best for: CI pipelines that need a grade-letter security gate; teams migrating off the archived Mozilla Observatory repo.
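A letter grade maps naturally onto a CI gate. The helper below compares Observatory-style grades; extracting the grade from the npx mdn-http-observatory output is left as a parsing step, since the CLI's exact output format can change between releases.

```shell
#!/usr/bin/env bash
# Letter-grade CI gate for Observatory-style A+ through F scores.
# The grade itself comes from an MDN HTTP Observatory scan; how you parse
# it out of the CLI output depends on the installed release.
set -euo pipefail

grade_rank() {
  case "$1" in
    A+|A|A-) echo 4 ;;
    B+|B|B-) echo 3 ;;
    C+|C|C-) echo 2 ;;
    D+|D|D-) echo 1 ;;
    *)       echo 0 ;;
  esac
}

require_grade() {
  # usage: require_grade ACTUAL MINIMUM
  if [ "$(grade_rank "$1")" -lt "$(grade_rank "$2")" ]; then
    echo "FAIL: security grade $1 below required $2" >&2
    return 1
  fi
  echo "PASS: security grade $1 meets $2"
}

# Example: require at least a B for the build to pass
require_grade "${1:-A}" B
```

Mapping grades to ranks rather than comparing strings keeps the gate honest about plus/minus variants: a B- still passes a B- minimum, and an A+ passes everything.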
4. axe-core
axe-core is the accessibility-testing engine maintained by Deque Systems under the Mozilla Public License 2.0. It is the engine that Lighthouse uses for its Accessibility category, which pa11y wraps, and which nearly every headless accessibility checker on GitHub depends on. As of v4.11.3 (April 2026), axe-core ships a minor release every three to five months and supports security updates for versions up to eighteen months old.
Accessibility is not separate from AEO. Heading semantics, ARIA landmark usage, and focus order all affect how an AI crawler extracts a page into a single-paragraph answer. An axe-core scan produces the cleanest possible signal on whether your page structure is machine-readable — and the GEO side benefits from the same work.
Key specs
- License: MPL-2.0
- Latest release: v4.11.3 (April 2026)
- GEO/AEO signals surfaced: heading clarity, ARIA semantics, WCAG compliance (feeds the Heading Clarity and Content Depth AEO dimensions)
- Integration: direct library usage in Node or the browser; also underpins Lighthouse, pa11y, and dozens of other wrappers
Best for: teams that need a programmatic accessibility engine to embed in their own audit tooling, and any AEO pipeline that values the downstream heading-clarity signal.
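For a pipeline that consumes axe-core results rather than embedding the library, a saved results file is the interface. The result shape assumed below (an object, or an array of objects, carrying a "violations" list) matches common axe runners, but verify it against your runner's actual output before gating on it.

```shell
#!/usr/bin/env bash
# Count axe-core violations in a saved results file. Assumes the common
# result shape: {"violations": [...]} or an array of such objects. Verify
# against the actual output of your runner (@axe-core/cli, pa11y, etc.).
set -euo pipefail

axe_violations() {
  python3 - "$1" <<'PY'
import json, sys
with open(sys.argv[1]) as f:
    data = json.load(f)
results = data if isinstance(data, list) else [data]
print(sum(len(r.get("violations", [])) for r in results))
PY
}

# Usage sketch (CLI name and flags are an assumption; check your tooling):
#   npx @axe-core/cli "$URL" --save axe.json
#   axe_violations axe.json
```

A zero count is the pass condition; each violation entry also carries the rule id and affected nodes if you want per-rule reporting rather than a single number.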
5. web-vitals
web-vitals is Google's official client-side library for measuring Core Web Vitals (CLS, INP, LCP) and supporting metrics (FCP, TTFB) against real user sessions. It is Apache-2.0 licensed, brotli-compressed to roughly 2 KB, and is the canonical measurement surface for anything beyond a synthetic Lighthouse run. The repo is actively maintained as of v5.
Core Web Vitals are a confirmed on-page ranking signal for both classical SEO and AEO. They are also notoriously noisy when measured synthetically — Lighthouse gives you a lab number, but the only number that moves AEO scoring is the real-user one. Shipping web-vitals in your production analytics pipeline is how you get field data you can act on.
Key specs
- License: Apache-2.0
- Bundle size: ~2 KB (brotli)
- GEO/AEO signals surfaced: Core Web Vitals field measurements (CLS, INP, LCP)
- Integration: browser library; shippable as a dependency in any analytics layer
Best for: real-user measurement. Synthetic Lighthouse scores will not get you to the same ground truth.
6. schema-dts
schema-dts provides TypeScript definitions for the full Schema.org vocabulary in JSON-LD form. It is Apache-2.0 licensed, hosted under the google/ GitHub organization, with v2.0.0 released in March 2026. The repository carries an explicit disclaimer that it is not an officially supported Google product — worth noting for enterprise procurement, though the project's release cadence is healthier than most officially supported Google OSS projects of similar age.
Structured data is the single largest AEO signal in the Foglift scoring model, and a 2025 BrightEdge analysis of Google AI Overview appearances found sites with FAQ schema and strong structured data see up to 40% more AI Overview appearances. schema-dts catches misnamed properties, invalid type unions, and missing required fields at compile time — the class of bug that quietly costs you citations if it ships to production undetected.
Key specs
- License: Apache-2.0
- Latest release: v2.0.0 (March 2026)
- GEO/AEO signals surfaced: Structured Data Richness — compile-time JSON-LD validation against Schema.org vocabulary
- Caveat: not an officially supported Google product, despite the google/ org
Best for: TypeScript projects (Next.js, Remix, SvelteKit) authoring JSON-LD in-source. The type-checking value is substantial, and it composes with the runtime Schema.org Validator at schema.org/validator.
7. Lychee
Lychee is a Rust-native broken-link checker that scans Markdown, HTML, reStructuredText, and crawled web pages for dead URLs and mail addresses. It is dual-licensed under Apache-2.0 or MIT, and ships actively — v0.24.0 released in April 2026. The async Rust engine makes it roughly an order of magnitude faster than the JavaScript alternatives for large sites, which matters when you are broken-link-checking a content hub monthly.
Broken links affect AEO scoring through two mechanisms. First, AI crawlers treat outbound links as a content-freshness and source-integrity signal; a page with four dead citations is a page AI engines will downweight for extraction. Second, inbound 404s across your own domain degrade internal PageRank-equivalent signals, which feeds Topical Authority in most GEO/AEO scoring systems. Lychee surfaces both in one pass.
Key specs
- License: Apache-2.0 OR MIT (dual)
- Latest release: v0.24.0 (April 2026)
- GEO/AEO signals surfaced: Topical Authority (internal 404s), Citation Formatting (outbound dead links)
- Language: Rust; available via cargo, Homebrew, and as a GitHub Action
Best for: content-heavy sites where stale outbound citations are a recurring problem; any CI pipeline that wants a zero-config broken-link gate.
8. pa11y
pa11y is a CLI-first accessibility-testing runner that wraps axe-core and HTML_CodeSniffer behind a uniform command-line interface. It is licensed LGPL-3.0-only, the one copyleft entry on this list — the LGPL is worth flagging for commercial integrators who need to avoid derivative-license exposure, though CLI invocation from a pipeline does not trigger the copyleft clause.
The reason pa11y earns a spot instead of being collapsed into axe-core is the workflow it unlocks. axe-core is a library; pa11y is a CLI and a dashboard (pa11y-dashboard) that a non-engineer on the content team can run against a URL and get a WCAG report out of, without needing to write Node code. For GEO/AEO work, the accessibility side of the audit is often owned by content or design, not engineering, and pa11y respects that split.
Key specs
- License: LGPL-3.0-only
- Latest release: v9.1.1 (February 2026)
- GEO/AEO signals surfaced: WCAG compliance levels, heading structure, ARIA errors (feeds Heading Clarity dimension)
- Integrations: CLI, Node API, and the pa11y-dashboard web UI
Best for: content and design teams running accessibility audits without a Node stack; any pipeline that wants axe-core results behind a simpler CLI surface.
Comparison table
| Tool | License | Latest release | GEO/AEO-native? | Primary role |
|---|---|---|---|---|
| foglift-scan | MIT | v1.0.1 (2026) | Yes | AEO scoring (8 dimensions) |
| Lighthouse | Apache-2.0 | v13.1.0 (2026) | No (adjacent) | SEO, perf, a11y baseline |
| MDN HTTP Observatory | MPL-2.0 | Actively maintained | No (adjacent) | Security headers |
| axe-core | MPL-2.0 | v4.11.3 (2026) | No (adjacent) | Accessibility engine |
| web-vitals | Apache-2.0 | v5.x (2026) | No (adjacent) | Core Web Vitals RUM |
| schema-dts | Apache-2.0 | v2.0.0 (2026) | No (adjacent) | Schema.org TS types |
| Lychee | Apache-2.0 / MIT | v0.24.0 (2026) | No (adjacent) | Broken-link checker |
| pa11y | LGPL-3.0-only | v9.1.1 (2026) | No (adjacent) | Accessibility CLI |
A full open-source GEO/AEO audit in 30 seconds
The pipeline below chains five of the eight tools into a single audit script. Copy it into a shell file and point it at any URL. Everything runs locally; nothing beyond HTTP requests to the target URL is transmitted.
#!/usr/bin/env bash
# audit.sh — open-source GEO/AEO audit
# Usage: ./audit.sh https://example.com
set -euo pipefail
URL="$1"
OUT="$(mktemp -d)"
# 1. AEO scoring (the GEO/AEO-native step)
npx -y foglift-scan "$URL" --json > "$OUT/aeo.json"
# 2. SEO / performance / accessibility baseline
npx -y lighthouse "$URL" --output=json --quiet > "$OUT/lighthouse.json"
# 3. Security headers
npx -y mdn-http-observatory --host "$(echo "$URL" | awk -F/ '{print $3}')" > "$OUT/security.txt"
# 4. Broken-link sweep
npx -y lychee --format json "$URL" > "$OUT/links.json" || true
# 5. Accessibility CLI (optional, overlaps Lighthouse but deeper)
npx -y pa11y --reporter json "$URL" > "$OUT/a11y.json" || true
echo "audit artifacts: $OUT"

Five tools, one script, zero vendor accounts. The only proprietary step you are still missing is citation tracking (does ChatGPT actually recommend this brand?), which requires paid AI-engine API access. That gap is the reason most teams pair the open-source audit with a hosted Foglift subscription for the monitoring side, and keep the on-page work fully in open source.
What this list deliberately excludes
- Closed-source GEO/AEO dashboards. Profound, AthenaHQ, Peec.ai, Otterly, Rankability, and Semrush AI Toolkit are covered in Best GEO/AEO Tools for Developers. They are valid tools — just not open source.
- The archived Mozilla Observatory repo (mozilla/http-observatory) — archived November 2024; use the MDN replacement listed above.
- The deprecated structured-data-testing-tool (GoogleChromeLabs). The repo is no longer reachable as of April 2026; schema-dts at compile time plus Google's Rich Results Test at runtime are the current recommended path.
- Commercial SEO tools with an open-source wrapper. A handful of Screaming Frog and Semrush automations exist on GitHub under permissive licenses, but they call back to a paid service — that is not open-source GEO/AEO in any meaningful sense.
FAQ
Is there an open-source tool that scans websites specifically for AI search readiness?
As of April 2026, foglift-scan is the only open-source CLI published on npm that is purpose-built for GEO/AEO scoring. It publishes under the MIT license, runs an eight-dimension AEO scan (Structured Data Richness, Heading Clarity, FAQ Quality, Entity Identity, Content Depth, Citation Formatting, Topical Authority, AI Crawler Access), and ships the exact scoring engine used by the hosted Foglift product. Other open-source tools in adjacent categories — Lighthouse, axe-core, MDN HTTP Observatory, schema-dts — contribute signals that feed into AEO scoring but are not AI-search-specific on their own.
Can I build a full GEO/AEO audit using only open-source tools?
Yes, with one caveat. The on-page signals that matter for AI search — structured data, heading clarity, accessibility, performance, security headers, broken links — are fully covered by the open-source stack described here. The caveat is citation tracking: measuring whether ChatGPT, Perplexity, Google AI Overview, Claude, and Gemini actually recommend your brand requires querying those engines, which means paid API access. No open-source tool ships with five-engine citation tracking today because the engines themselves are paid-tier services.
Why are most GEO/AEO tools closed source?
Two reasons. First, citation tracking requires paid API access to the AI engines, so the economic model for most vendors is a hosted subscription that bundles the query costs — open-sourcing the client does not remove the API-spend dependency. Second, the category is young: GEO-Bench (the first benchmark for generative-engine optimization, Aggarwal et al. KDD 2024, arXiv preprint November 2023) and the Model Context Protocol specification (Anthropic, November 2024) are both roughly two years old. Most vendors shipped closed dashboards in 2025 to capture the market before committing to open code.
What is the difference between foglift-scan and Google Lighthouse?
Lighthouse is a general-purpose audit tool that produces SEO, performance, accessibility, and best-practices scores. foglift-scan subsumes the Lighthouse-style SEO surface and adds an AEO score built from eight AI-search-specific dimensions — entity identity, topical authority, citation formatting, AI crawler access. Both are permissively licensed open source (MIT and Apache-2.0 respectively) and complementary rather than redundant — most teams run Lighthouse and foglift-scan together.
Can I run an offline GEO/AEO audit without sending data to a third party?
Partially. foglift-scan, Lighthouse, axe-core, pa11y, and Lychee run fully locally against a URL or a local server — no data leaves your machine beyond the HTTP requests to the target site. The MDN HTTP Observatory CLI can be run locally via npx. The exception is Core Web Vitals field data: web-vitals measures on real user sessions in a browser, so field measurements require deploying the library in your production site's analytics pipeline.
Is schema-dts an officially supported Google product?
No. schema-dts is published under the google/ GitHub organization and released under Apache-2.0, but the repository README includes an explicit disclaimer that it is not an officially supported Google product. For teams authoring JSON-LD in TypeScript, the type-checking value is high enough that unofficial status is a reasonable tradeoff.
What replaced Mozilla's original HTTP Observatory?
Mozilla archived the original Python-based HTTP Observatory in November 2024 and recommended migrating to the Node.js-based replacement at github.com/mdn/mdn-http-observatory. The new implementation retains MPL-2.0 licensing and the A+ through F grade-letter scoring, exposes a public service at developer.mozilla.org/en-US/observatory, and ships a CLI callable via npx for CI pipelines.
Sources & Further Reading
- Aggarwal, Murahari, Rajpurohit, Kalyan, Narasimhan, Deshpande — "GEO: Generative Engine Optimization" (KDD 2024, arXiv:2311.09735). Introduces GEO-Bench (10,000 queries) and shows source-level optimization lifts generative-engine citation visibility by up to 40%.
- SE Ranking / Search Engine Journal — "Top Factors Influencing ChatGPT Citations" (2025, 129,000-domain analysis). ChatGPT cites only 15% of retrieved pages; top 10 domains take 46% of all citations in a topic.
- Stack Overflow — 2025 Developer Survey (n > 49,000 respondents). 76% of developers are using or planning to use AI tools in their development workflow.
- Gartner — "Search Engine Volume Will Drop 25% by 2026, Due to AI Chatbots and Other Virtual Agents" (February 2024).
- BrightEdge / xseek — Structured data and AI Overview analysis (2025). Sites with FAQ schema and strong structured data see up to 40% more AI Overview appearances.
- Anthropic — Model Context Protocol specification (modelcontextprotocol.io, November 2024). Defines the open interface that lets agentic tools call external servers; referenced for category-timing context.
- Mozilla Foundation — HTTP Observatory archival notice (November 2024). Source for the migration recommendation to the MDN-maintained replacement.
Fundamentals: Learn about GEO (Generative Engine Optimization) and AEO (Answer Engine Optimization) — the two frameworks for optimizing your content for AI search engines.
Related reading
Best GEO/AEO Tools for Developers 2026
The broader developer-primitives ranking across hosted platforms
Best AI Search Tools with MCP Integration 2026
Model Context Protocol support across GEO/AEO platforms
Track AI Crawler Activity
How to log GPTBot, ClaudeBot, PerplexityBot and other AI crawler requests