How We Stopped a Hidden Traffic Spike From Crashing a Drupal Website
Running a Drupal website in production means dealing with more than just content and design. Behind the scenes, every visit triggers PHP processes, database queries, caches, and complex infrastructure decisions. Most of the time, everything works smoothly — until suddenly it doesn’t.
Recently, one of our Drupal websites started behaving oddly:
Pages became extremely slow
Server CPU spiked
PHP-FPM (the PHP process manager) created a huge number of worker processes
Some requests even failed completely
MongoDB backend became overloaded
At first glance, this looked like a hardware or code issue. But the real cause was much more interesting — and surprising.
The Mystery: Why Was the Website Suddenly Overwhelmed?
For several days, the site showed warnings like:
“[pool www] server reached pm.max_children setting, consider raising it”
This usually means the server is receiving too many PHP requests at the same time.
But looking at the metrics, something didn’t make sense:
Memory usage was nowhere near the server’s 1 GB limit
Only 6–8 PHP workers were running
Traffic didn’t look that high
So why did PHP-FPM suddenly try to spawn 20+ processes and become overloaded?
Even worse — some requests took far longer than normal, up to several seconds.
Digging Deeper: It Wasn’t Human Traffic
Eventually, we discovered the real problem:
Thousands of requests were hitting URLs that were not cached in Memcache.
These URLs required:
Full Drupal bootstrap
PHP processing
Multiple expensive MongoDB queries
Each request looked legitimate in the sense that it sent a normal browser user agent, not a bot signature.
Cloudflare therefore treated them as “real users” and did not block or slow them down.
But they were not real users.
They were coming from automated scripts or crawlers that Cloudflare didn’t detect.
Because the pages weren’t cached, every request caused heavy backend load.
And because they came in bursts, PHP-FPM panicked, spawning workers as fast as possible.
This created a chain reaction:
PHP-FPM process count exploded
MongoDB slowed down due to too many simultaneous queries
Drupal became slower
Some pages failed
A few legitimate users experienced downtime
The Fix: A Simple Cloudflare Rate Limit Rule
The solution turned out to be very simple — and extremely effective.
We added a Cloudflare Rate Limiting Rule:
Limit the number of requests a single IP can make to specific expensive URLs.
You can tune it for your own site, for example (a sketch of the API call follows this list):
Allow 10 requests per minute
If exceeded → block, challenge, or slow down
Only apply it to heavy pages or endpoints
Log the IPs that trigger the rule
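For reference, here is a rough sketch of such a rule created through Cloudflare’s Rulesets API. The zone ID, API token, and the /search path are placeholders; the same rule can also be built in the dashboard, and the exact options available depend on your plan.

```php
<?php
// Sketch: create a rate limiting rule through Cloudflare's Rulesets API.
// $zoneId, $apiToken, and the "/search" path are placeholders; adapt them
// to your own zone and your own expensive URLs.
$zoneId   = 'YOUR_ZONE_ID';
$apiToken = 'YOUR_API_TOKEN';

$payload = [
    'rules' => [[
        'description' => 'Rate limit expensive uncached pages',
        'expression'  => '(http.request.uri.path contains "/search")',
        'action'      => 'block',
        'ratelimit'   => [
            'characteristics'     => ['ip.src', 'cf.colo.id'],
            'period'              => 60,  // one-minute window
            'requests_per_period' => 10,  // allow 10 requests per minute
            'mitigation_timeout'  => 600, // then block the IP for 10 minutes
        ],
    ]],
];

$ch = curl_init("https://api.cloudflare.com/client/v4/zones/{$zoneId}/rulesets/phases/http_ratelimit/entrypoint");
curl_setopt_array($ch, [
    CURLOPT_CUSTOMREQUEST  => 'PUT',
    CURLOPT_POSTFIELDS     => json_encode($payload),
    CURLOPT_HTTPHEADER     => [
        "Authorization: Bearer {$apiToken}",
        'Content-Type: application/json',
    ],
    CURLOPT_RETURNTRANSFER => true,
]);
echo curl_exec($ch);
curl_close($ch);
```

One caveat: a PUT to the phase entrypoint replaces every rule in that phase, so include any existing rate limiting rules in the payload as well.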
After enabling it, everything immediately stabilized:
No more PHP-FPM warnings
No more sudden spikes to max_children
MongoDB queries returned to normal
PHP execution time dropped back to expected levels
Users saw consistent fast performance again
Why This Matters for Any Website Owner
Even if you don’t know Drupal or server internals, this story shows a universal truth:
Not all traffic is good traffic. And not all harmful traffic looks like a bot.
Many Israeli businesses, NGOs, and institutions rely on Cloudflare, Drupal, or both.
But Cloudflare alone cannot catch everything — especially when traffic appears “normal” on the surface.
Rate limiting is an essential protection you must activate, especially for:
Search pages
Views with exposed filters
Node previews
API endpoints
Administrative or login areas
Pages with no caching
Anything that triggers expensive database queries
Without it, even a small scraping script can quietly overload your backend.
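To make that concrete, the hypothetical sketch below is all a “small scraping script” needs to be. It sends a normal browser user agent, so nothing about it looks like a bot, yet every iteration forces a full Drupal bootstrap and fresh database queries.

```php
<?php
// Hypothetical example of the kind of "small scraping script" that can
// cause an outage like ours: a plain loop with a browser User-Agent.
// Nothing here looks like a bot signature, yet every request bypasses
// the cache and hits the backend.
for ($page = 0; $page < 10000; $page++) {
    $ch = curl_init("https://example.com/search?page={$page}");
    curl_setopt_array($ch, [
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
        CURLOPT_RETURNTRANSFER => true,
    ]);
    curl_exec($ch);
    curl_close($ch);
    // No delay between requests: bursts like this are what push
    // PHP-FPM into spawning workers as fast as possible.
}
```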
What You Can Do Today
If you run a Drupal site — or any site handled by PHP or a database — take these steps:
✔ 1. Identify uncached or heavy pages
These are your weak spots.
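One quick way to find them, sketched below with placeholder URLs: request each suspect page and inspect the cache headers. Drupal reports X-Drupal-Cache and X-Drupal-Dynamic-Cache, and Cloudflare adds CF-Cache-Status; a heavy page that always answers MISS (or DYNAMIC) is a weak spot.

```php
<?php
// Sketch: probe suspect URLs and print their cache headers.
// The URLs are placeholders; substitute your own heavy pages.
// Note: header-name casing can vary by server.
$urls = [
    'https://example.com/search?keys=test',
    'https://example.com/api/items',
];

foreach ($urls as $url) {
    $h = get_headers($url, true);
    printf(
        "%s\n  CF-Cache-Status: %s | X-Drupal-Cache: %s | X-Drupal-Dynamic-Cache: %s\n",
        $url,
        $h['CF-Cache-Status'] ?? 'n/a',
        $h['X-Drupal-Cache'] ?? 'n/a',
        $h['X-Drupal-Dynamic-Cache'] ?? 'n/a'
    );
}
```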
✔ 2. Add Cloudflare Rate Limiting
Start small (10–20 requests per minute per IP).
✔ 3. Monitor PHP-FPM process count
Spikes = suspicious traffic.
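PHP-FPM has a built-in status page for exactly this. Enable pm.status_path = /status in the pool config, expose it only to internal IPs, and a small watcher along these lines can poll it (the URL below is an assumption; adjust it to your setup):

```php
<?php
// Sketch: poll PHP-FPM's built-in status page.
// Requires `pm.status_path = /status` in the pool config, and the
// path should be reachable from internal IPs only. The URL is an
// assumption; adjust it to your setup.
$raw    = file_get_contents('http://127.0.0.1/status?json');
$status = json_decode($raw, true);

printf(
    "active: %d / total: %d | listen queue: %d | max children reached: %d\n",
    $status['active processes'],
    $status['total processes'],
    $status['listen queue'],
    $status['max children reached']
);

// A climbing "max children reached" counter means PHP-FPM keeps
// hitting its ceiling; on our site that was the first visible
// symptom of the hidden spike.
```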
✔ 4. Monitor backend database queries
Slowdowns often come from too many parallel requests.
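For MongoDB, the serverStatus command exposes connection counts and operation counters. Here is a sketch using the PHP mongodb extension (the connection string is an assumption):

```php
<?php
// Sketch: sample MongoDB load via the serverStatus command.
// Requires the mongodb extension; the connection string is an assumption.
$manager = new MongoDB\Driver\Manager('mongodb://127.0.0.1:27017');
$cursor  = $manager->executeCommand(
    'admin',
    new MongoDB\Driver\Command(['serverStatus' => 1])
);
$status = current($cursor->toArray());

// Watch current connections and the query/command counters: a jump in
// parallel queries without a matching rise in real traffic is the same
// pattern we saw during the incident.
printf(
    "connections: %d | queries: %d | commands: %d\n",
    $status->connections->current,
    $status->opcounters->query,
    $status->opcounters->command
);
```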
✔ 5. Consider caching strategies
Drupal supports:
Dynamic Page Cache
Memcache / Redis
Internal Page Cache
Reverse proxies
Just one uncached path can bring down the whole site.
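As one illustration, here is roughly what the settings.php wiring looks like with the contrib Memcache module installed (server address assumed; the Redis module offers an equivalent setup):

```php
// settings.php excerpt: a sketch for the Drupal memcache contrib module.
// Assumes the module is installed and Memcache listens on 127.0.0.1:11211.
$settings['memcache']['servers'] = ['127.0.0.1:11211' => 'default'];
$settings['memcache']['bins']    = ['default' => 'default'];

// Route Drupal's default cache backend through Memcache.
$settings['cache']['default'] = 'cache.backend.memcache';

// Internal Page Cache and Dynamic Page Cache are core modules enabled
// separately; they decide what is cacheable, while the backend above
// decides where cache entries are stored.
```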
One Final Lesson
Modern websites don’t crash because of thousands of real visitors.
They crash because of:
Scrapers
Broken bots
Over-eager crawlers
Misconfigured SEO tools
Automated systems that ignore robots.txt
These don’t always announce themselves as bots.
They look like human traffic — until your server collapses.
Cloudflare’s rate limiting turned out to be the simplest and most effective protection.
Want help tuning your own Drupal website?
If you’re running Drupal in Kubernetes, Docker, or any modern infrastructure, the right PHP-FPM and caching configuration makes all the difference.
Feel free to reach out; someone in the Drupal community can help analyze your:
Cache efficiency
PHP-FPM configuration
Memory usage
MongoDB or MySQL performance
Cloudflare rules
Kubernetes resource tuning
Keeping your website fast and stable doesn’t need to be complicated — as long as you understand what’s really happening behind the scenes.