How Meta's Crawler Spiked My Vercel Bill with 11 Million Requests
Discover how a runaway Meta crawler triggered 11 million requests in a month and how to protect your Vercel-hosted site from unexpected infrastructure bills.

It started with a casual glance at my Vercel dashboard. I usually check it to see how my latest Next.js deployments are performing or to monitor real-time traffic for a client's Shopify headless storefront. But this time, the numbers didn't make sense.
11 million requests.
For a portfolio site and a couple of small e-commerce demos, that kind of traffic is astronomical. My first thought was a DDoS attack. My second thought was my bank account.
After digging into the access logs, the culprit wasn't a malicious hacker network or a viral Reddit post. It was Meta.
The Incident: facebookexternalhit Gone Wild
Specifically, the User-Agent strings were dominated by facebookexternalhit/1.1 and Meta-ExternalAgent. These are the crawlers Meta (Facebook/Instagram) uses to generate link previews and, increasingly, to scrape content for AI training.
While good SEO and rich link previews are essential for any business, 11 million requests in a 30-day window is not normal behavior. It was hammering every single route, asset, and API endpoint repeatedly.
The Vercel Cost Model
Here is where it hurts. Vercel, like many serverless platforms, charges based on usage. While static asset bandwidth is cheap, Edge Middleware and Serverless Function invocations are not unlimited.
If you are using Middleware for authentication, geolocation, or A/B testing (which I do heavily), every single request that hits your site triggers that middleware. Even if the bot is just pinging your homepage, your code runs, and the meter ticks.
For 11 million requests, those milliseconds add up to a significant unexpected bill.
How to Stop the Bleeding
If you find yourself in this situation, here is the triage process I used to fix it.
1. The Polite Way: robots.txt
The first line of defense is telling the bot to go away. I updated my robots.txt file to explicitly disallow these agents:
```
User-agent: facebookexternalhit
User-agent: Meta-ExternalAgent
Disallow: /
```

The Problem: This relies on the bot being polite. While Meta usually respects robots.txt, propagation takes time, and during a glitch (which this seemed to be), the crawler might ignore it completely.
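If you're on the Next.js App Router, you can also generate robots.txt from code so the rules live alongside the rest of the app. Here's a minimal sketch mirroring the static file above (app/robots.ts is a standard Next.js feature; the exact rule list is just the one from this post):

```typescript
// app/robots.ts — App Router equivalent of the static robots.txt above
import type { MetadataRoute } from 'next';

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      // Tell Meta's crawlers to stay away from everything
      { userAgent: 'facebookexternalhit', disallow: '/' },
      { userAgent: 'Meta-ExternalAgent', disallow: '/' },
      // Leave other crawlers (Googlebot, etc.) untouched
      { userAgent: '*', allow: '/' },
    ],
  };
}
```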
2. The Application Layer: Blocking in Middleware
Since Vercel's Edge Middleware was the cost center, I added logic to reject these User-Agents immediately.
```typescript
// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

export function middleware(request: NextRequest) {
  const userAgent = request.headers.get('user-agent') || '';

  // Turn Meta's crawlers away with an empty 403 before any page rendering happens
  if (userAgent.includes('facebookexternalhit') || userAgent.includes('Meta-ExternalAgent')) {
    return new NextResponse(null, { status: 403 });
  }

  return NextResponse.next();
}
```

The Catch: This still counts as an invocation! You are still paying Vercel to say "No" to Meta. It's cheaper than rendering the full page, but at 11 million hits, it's still not free.
3. The Infrastructure Layer: Vercel Firewall & Cloudflare
This is the real solution. You need to block the traffic before it executes your code.
- Vercel Firewall: Vercel has a built-in firewall where you can block traffic by User-Agent or IP. This stops the request at the edge before it hits your functions.
- Cloudflare (Recommended): I sit all my projects behind Cloudflare. By creating a WAF (Web Application Firewall) rule to challenge or block requests containing specific User-Agents, the traffic never even reaches Vercel's infrastructure. This reduced the billable requests to zero.
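For context, the Cloudflare side is a single custom WAF rule with a "Block" action. The expression below is a sketch in Cloudflare's rules language, matching the user agent case-insensitively since the exact casing of the agent string can vary; adapt it to whichever agents show up in your own logs:

```
(lower(http.user_agent) contains "facebookexternalhit") or
(lower(http.user_agent) contains "meta-externalagent")
```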
Lessons for Developers
- Observability is Key: Set up billing alerts. Vercel allows you to set "Spend Management" limits. Enable them. If I hadn't checked my dashboard, this could have gone on for months.
- Middleware is Not Free: Remember that middleware runs on every request unless configured otherwise via a matcher. Be careful what you put there (see the matcher sketch after this list).
- Defense in Depth: Don't rely on your application code to filter bad traffic. Use infrastructure-level tools like Cloudflare or AWS WAF to scrub traffic before it costs you money.
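Here is roughly what that matcher looks like. The pattern below is the common "skip Next.js internals and static files" recipe from the Next.js docs; you'd tune it to your own routes:

```typescript
// middleware.ts — scope the middleware so it skips framework internals and static assets
export const config = {
  matcher: [
    // Run on every path except _next/static, the image optimizer, and the favicon
    '/((?!_next/static|_next/image|favicon.ico).*)',
  ],
};
```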
As AI agents and crawlers become more aggressive, "defensive hosting" is becoming a required skill for modern web development. Don't let a bot eat your profit margin.