Webpage AI Token Counter Tool
Count the tokens it would take for different AI models to consume the content on your webpage. Understanding how LLMs consume and repurpose web content could be the future of AEO (agentic engine optimisation) and growth in LLM answers.
Why Token Counting Matters for SEO & AEO
Agentic Engine Optimisation is about formatting your website content so it can be delivered efficiently to AI agents, which "consume" your content, provided they aren't blocked by your robots.txt file or at the IP level.
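As a point of reference, robots.txt rules for AI crawlers look like the sketch below. The user-agent tokens shown are the names these vendors have published for their crawlers, but crawler names change, so verify against each vendor's current documentation; the paths are hypothetical examples.

```
# Allow OpenAI's crawler to read the whole site
User-agent: GPTBot
Allow: /

# Block Perplexity's crawler from a hypothetical private section
User-agent: PerplexityBot
Disallow: /private/
```

Note that robots.txt is advisory: well-behaved crawlers honour it, but it is not an access control, which is why the text above also mentions IP-level blocking.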
Context Window Efficiency
AI models have a limited "context window", and some sources suggest AI crawlers may process as little as 30,000 tokens per page. If your page is bloated with unnecessary boilerplate, hides content behind interactive elements, or relies heavily on JavaScript to render content, the model might truncate your most important content, lose focus on your key messages, or miss the unique value proposition of your page.
Crawl Budget & Cost
AI agents (like Perplexity or GPT-Search) incur costs based on tokens. Token-efficient, well-optimised pages are faster to parse and more likely to be fully used in an AI's synthesised answer.
So what does this mean? Effectively, making your pages "lighter" and easier to consume helps ensure that more of your key content is processed by LLMs, and therefore more of your value proposition is included in AI answers about your brand.
Outputs
You'll get a few outputs after you've entered your URL. Here's a description of each to help you understand the tool and the information it returns.
Character, Word and Average Token Estimate
Simple metrics covering the word count, character count and the average number of tokens consumed across four of the world's leading AI models.
Token Estimates by Model
Specific estimates for each model, including:
- OpenAI (GPT-5.4)
- Anthropic (Claude Sonnet/Opus)
- Google Gemini (3.1 Pro/Flash)
- DeepSeek
Extracted Content (Raw Text Preview)
This is the raw text extracted from your page, useful for understanding exactly which elements of the page have been consumed and assessed.
Markdown Version (for llms.txt)
Formatted for AI agents, and something you can paste directly into your llms.txt file, should you wish to create one.
Markdown is the language of AI agents, so you can use files on your website, like llms.txt and skills.md, to tell them directly:
- more about what the content on your webpage covers
- in detail, what services, tools and products your site offers.
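To make this concrete, here is a minimal llms.txt sketch following the commonly proposed layout (an H1 title, a blockquote summary, then H2 sections of annotated links). The site name and URLs are hypothetical placeholders, not real endpoints:

```markdown
# Acme Analytics

> Acme Analytics offers web analytics dashboards and free SEO tools,
> including an AI token counter for webpages.

## Tools

- [Token Counter](https://example.com/tools/token-counter): Estimate how many tokens AI models need to consume a page.

## Docs

- [Getting started](https://example.com/docs/start): Setup guide for new users.
```

Because the file is plain markdown, agents can ingest it directly without rendering JavaScript or stripping boilerplate.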
How we calculate these counts
While exact tokenisation requires model-specific WASM libraries, this tool uses refined heuristics based on typical English language usage:
- OpenAI: 1 token ≈ 4 characters. Based on cl100k_base / o200k_base tokeniser.
- Claude: 1 token ≈ 3.5 characters. Anthropic's tokeniser is often slightly more dense for complex prose.
- Gemini: 1 token ≈ 4 characters. Google's models typically align with the standard 1:4 ratio for web content.
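The heuristics above can be sketched as a small function. This is an illustrative estimator using the stated characters-per-token ratios, not the exact output of any model's tokeniser; exact counts require the model-specific libraries the text mentions.

```python
# Character-based token estimates, using the ratios stated above.
# These are heuristics for typical English prose, not exact
# tokeniser output.
CHARS_PER_TOKEN = {
    "openai": 4.0,   # cl100k_base / o200k_base, roughly 4 chars/token
    "claude": 3.5,   # Anthropic's tokeniser is often slightly denser
    "gemini": 4.0,   # typically aligns with the standard 1:4 ratio
}

def estimate_tokens(text: str) -> dict:
    """Return a rough token estimate per model for the given text."""
    n_chars = len(text)
    return {
        model: round(n_chars / ratio)
        for model, ratio in CHARS_PER_TOKEN.items()
    }

# Example usage on a short extracted-text snippet
sample = "Count the tokens it would take for AI models to read this page."
estimates = estimate_tokens(sample)
```

For production-grade counts you would swap the OpenAI entry for a real tokeniser call (e.g. the tiktoken library), but the heuristic is usually within a reasonable margin for English web copy.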
Note: These are estimates. Exact counts may vary slightly based on non-English characters, code blocks, and specific model versions.