Quantifying the Cost of AI Crawlers: An Introduction to BotCost.dev

May 11, 2026

The rise of Large Language Models (LLMs) has led to an increase in aggressive web crawling. Bots from companies like OpenAI (GPTBot), Anthropic (ClaudeBot), and Perplexity now traverse the web around the clock to ingest data for training and real-time retrieval. For many website owners, this means significantly higher bandwidth consumption and server load with no direct benefit in return.

The Hidden Cost of AI Scraping

While many developers treat bot traffic as background noise, the cumulative effect can be substantial. According to data provided by BotCost.dev, approximately 38% of all web traffic is now non-human. For a typical site with 50,000 visitors per month, the estimated monthly bandwidth cost from AI scrapers alone can reach $180.

This "invisible" cost is particularly burdensome for sites hosted on platforms with strict bandwidth quotas or those paying for egress traffic. When AI bots crawl a site frequently, they can eat through resources that should be reserved for actual human users, potentially impacting site performance and increasing operational overhead.
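To make the egress arithmetic concrete, here is a minimal sketch. The flat rate of $0.09 per GB is an assumed, illustrative price and not a figure from BotCost.dev; the function names are hypothetical, and real hosting plans often use tiers or quotas instead of a single per-GB rate.

```python
def bandwidth_cost_usd(bytes_transferred: int, price_per_gb: float) -> float:
    """Cost of serving `bytes_transferred` at a flat per-GB egress price."""
    return bytes_transferred / 1e9 * price_per_gb


def implied_transfer_gb(monthly_cost_usd: float, price_per_gb: float) -> float:
    """Transfer volume (in GB) that a given monthly bill would correspond to."""
    return monthly_cost_usd / price_per_gb


# Assumption for illustration only; substitute your provider's actual rate.
ASSUMED_PRICE_PER_GB = 0.09

if __name__ == "__main__":
    # Cost of 500 GB served to crawlers at the assumed rate.
    print(round(bandwidth_cost_usd(500 * 10**9, ASSUMED_PRICE_PER_GB), 2))
    # Volume that a $180 monthly bill would imply at the assumed rate.
    print(round(implied_transfer_gb(180, ASSUMED_PRICE_PER_GB)))
```

Under a different price per GB the implied volume changes proportionally, so the same dollar figure can correspond to very different crawl volumes depending on the host.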

How BotCost.dev Works

BotCost.dev is a free analyzer designed to help site owners quantify this impact. The tool operates on a privacy-first principle: users select their server logs (from Nginx, Apache, Cloudflare, or Vercel), and all processing happens within the browser. No data is sent to a remote server, so sensitive log data remains private.

The Analysis Process

  1. Log Ingestion: The tool accepts log files in common formats, including .log, .csv, and .json.
  2. Fingerprinting: The analyzer matches requests against 18 known AI bot fingerprints. This includes major players like GPTBot, ClaudeBot, Bytespider, and CCBot.
  3. Cost Calculation: By analyzing the volume of data transferred to these bots, the tool calculates the real-world bandwidth cost in dollars.
  4. Mitigation: Once the cost is identified, the tool generates ready-to-paste WAF (Web Application Firewall) rules for Cloudflare, Nginx, or Next.js, as well as robots.txt configurations to block these bots.
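The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not BotCost.dev's actual implementation: it assumes the Nginx/Apache "combined" log format, matches only the four crawler names mentioned in this article (the tool's full list of 18 is not reproduced here), and takes the per-GB price as a caller-supplied parameter. All function names and the sample log lines are hypothetical.

```python
import re
from collections import defaultdict

# Only the crawler tokens named in the article; not the tool's full list.
AI_BOT_TOKENS = ["GPTBot", "ClaudeBot", "Bytespider", "CCBot"]

# Nginx/Apache "combined" log format.
COMBINED_LOG = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def match_bot(user_agent):
    """Return the first known token found in the User-Agent, else None."""
    ua = user_agent.lower()
    for token in AI_BOT_TOKENS:
        if token.lower() in ua:
            return token
    return None


def bytes_per_bot(lines):
    """Sum response sizes per matched bot; unparseable lines are skipped."""
    totals = defaultdict(int)
    for line in lines:
        m = COMBINED_LOG.match(line)
        if not m:
            continue
        bot = match_bot(m.group("agent"))
        if bot is None:
            continue
        size = m.group("size")
        totals[bot] += 0 if size == "-" else int(size)
    return dict(totals)


def cost_per_bot(totals, price_per_gb):
    """Convert byte totals to dollars at a flat, caller-supplied per-GB price."""
    return {bot: n / 1e9 * price_per_gb for bot, n in totals.items()}


def robots_txt(bots):
    """Build robots.txt rules that disallow the whole site for each bot."""
    return "\n\n".join(f"User-agent: {b}\nDisallow: /" for b in bots) + "\n"


# Made-up sample lines for demonstration only.
sample_lines = [
    '203.0.113.7 - - [11/May/2026:10:00:00 +0000] "GET /a HTTP/1.1" 200 2000000 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.7 - - [11/May/2026:10:00:05 +0000] "GET /b HTTP/1.1" 200 1000000 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.2 - - [11/May/2026:10:01:00 +0000] "GET /a HTTP/1.1" 200 500000 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.9 - - [11/May/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    totals = bytes_per_bot(sample_lines)
    print(totals)
    print(cost_per_bot(totals, price_per_gb=0.09))  # assumed rate
    print(robots_txt(sorted(totals)))
```

Substring matching on the User-Agent header is simple but easy to spoof, and robots.txt is advisory: it only affects crawlers that choose to honor it, which is why the tool also offers WAF rules.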

Practical Insights and Limitations

Community feedback from Hacker News highlights an important distinction in non-human traffic. While BotCost.dev focuses on AI bots, not all non-human traffic is AI-driven. As one user (@nottorp) noted after running the tool on a personal landing page:

"0% of traffic is AI bots. 99% of traffic is vulnerability scanners actually."

This serves as a reminder that while AI crawlers are a new and significant source of bandwidth drain, they remain part of a larger ecosystem of automated traffic, including search engine indexers, vulnerability scanners, and malicious actors.
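The distinction between kinds of automated traffic can be illustrated with a rough classifier. This is a heuristic sketch only: the token lists and probe paths below are small illustrative assumptions, the category names are hypothetical, and none of it reflects how BotCost.dev classifies requests.

```python
from collections import Counter

# Illustrative, deliberately incomplete lists.
AI_BOT_TOKENS = ("gptbot", "claudebot", "bytespider", "ccbot")
SEARCH_TOKENS = ("googlebot", "bingbot")
PROBE_PATHS = ("/.env", "/wp-login.php", "/.git/", "/phpmyadmin")


def classify(user_agent, path):
    """Bucket a request by User-Agent token first, then by requested path."""
    ua = user_agent.lower()
    p = path.lower()
    if any(t in ua for t in AI_BOT_TOKENS):
        return "ai_bot"
    if any(t in ua for t in SEARCH_TOKENS):
        return "search_indexer"
    if any(p.startswith(prefix) for prefix in PROBE_PATHS):
        return "scanner_probe"
    return "other"


def traffic_shares(requests):
    """Fraction of requests per category for (user_agent, path) pairs."""
    counts = Counter(classify(ua, path) for ua, path in requests)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


if __name__ == "__main__":
    demo = [
        ("Mozilla/5.0", "/.env"),
        ("Mozilla/5.0", "/wp-login.php"),
        ("Mozilla/5.0 (compatible; GPTBot/1.0)", "/"),
        ("Mozilla/5.0", "/about"),
    ]
    print(traffic_shares(demo))
```

Vulnerability scanners often send browser-like User-Agent strings, so the requested path tends to be a more useful signal for them than the header is.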

Conclusion

As demand for AI training data grows, the ability to distinguish between beneficial traffic and resource-draining scrapers becomes critical. Tools like BotCost.dev provide the visibility needed to move from guessing to estimating the financial impact of AI bots on your infrastructure, allowing site owners to make informed decisions about whether to allow or block these agents.
