PHP-FPM Tuning for High Traffic: The Production Guide That Saved Our Servers
I have spent the last decade tuning PHP-FPM pools for everything from small WordPress blogs to Magento stores handling thousands of requests per second. The default configuration ships broken for production. Here is everything I have learned about getting it right.
TL;DR
- Calculate: Available RAM / Average PHP worker memory = pm.max_children.
- Use pm = static for dedicated servers with consistent traffic.
- Use pm = dynamic for shared servers where memory needs to be freed during quiet periods.
- Use pm = ondemand for low-traffic sites or development environments.
- Always set pm.max_requests = 500 to prevent memory leaks from killing you slowly.
Why the Default PHP-FPM Config Kills Production
Let me tell you about the worst 3 AM wake-up call I ever got. A Magento 2 store was returning 502 Bad Gateway errors during a flash sale. The server had 16 GB of RAM, a fast NVMe drive, and a well-tuned MySQL instance. The problem? PHP-FPM was still running its default configuration.
Here is what the default pool config looks like on most Linux distributions:
pm = dynamic
pm.max_children = 5
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3
Read that again. Five max children. That means your server can handle exactly five concurrent PHP requests. Request number six sits in the listen queue. If the queue fills up, nginx or Apache returns a 502. On a server with 16 GB of RAM, you are using maybe 250 MB for PHP workers while the rest of your memory sits idle, doing absolutely nothing useful.
The default config exists because distribution maintainers have to pick values that work on a 512 MB VPS running 15 different services. It is the safest possible default. But if you are running a dedicated web server, or even a container that only runs PHP, those defaults are actively hurting you.
I have seen this pattern dozens of times: a team spends weeks optimizing database queries, adding Redis caching, and fine-tuning Varnish VCL rules, but nobody ever looks at the PHP-FPM pool config. Then traffic spikes and everything falls over because the bottleneck was always the five-worker limit.
Step 1: Find Your Per-Worker Memory Usage
Before you change anything, you need to know how much memory each PHP-FPM worker actually uses. This varies wildly depending on your application. A simple Laravel API might use 20 MB per worker. A Magento 2 instance with 200+ modules can easily consume 80 to 120 MB per worker. If you are dealing with PHP memory exhaustion issues, this step is especially critical.
The simplest way to measure is with ps:
# Show RSS (Resident Set Size) for all PHP-FPM workers
ps -C php-fpm -o pid,rss,command --sort=-rss
# Get the average in MB
ps -C php-fpm -o rss= | awk '{ total += $1; count++ } END { print total/count/1024 " MB" }'
Run this during peak traffic, not at 2 AM when nobody is on the site. You want to measure memory usage under real load with real requests hitting your actual application code.
For a more precise measurement from the PHP side, add this to a frequently-hit endpoint temporarily:
// Add temporarily to measure peak memory
error_log('Peak memory: ' . memory_get_peak_usage(true) / 1024 / 1024 . ' MB');
Check your PHP error log after a few hours of traffic and look at the distribution. You want the 95th percentile, not the average, because you need to plan for the heavy requests, not the lightweight ones. If most requests use 40 MB but your product import endpoint uses 200 MB, you need to account for that.
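If you want a rough percentile rather than the mean, the same ps output can be sorted and indexed with awk. A minimal sketch using a nearest-rank p95 — the printf feeds sample RSS values in KB; in production, replace it with ps -C php-fpm -o rss=:

```shell
# Sample RSS values (KB) stand in for: ps -C php-fpm -o rss=
printf '40960\n51200\n49152\n204800\n46080\n' \
  | sort -n \
  | awk '{ a[NR] = $1 }
         END {
           i = int(NR * 0.95); if (i < NR * 0.95) i++;   # nearest-rank p95
           printf "p95 worker memory: %.0f MB\n", a[i] / 1024
         }'
```

With these sample values the p95 lands on the 200 MB outlier, which is exactly the point: plan for the heavy requests, not the average ones.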
Pro tip: If you are running multiple pools (say, one for your storefront and one for your admin panel), measure them separately. Admin operations in Magento or WordPress can use 3 to 5 times more memory than frontend requests.
Step 2: Calculate pm.max_children
This is the single most important PHP-FPM setting. Get this wrong and you either waste memory or starve your application of workers. The formula is straightforward:
pm.max_children = (Total RAM - Reserved RAM) / Per-worker memory
Let me walk through a real example. Say you have an 8 GB server running nginx, PHP-FPM, and Redis (with MySQL on a separate RDS instance):
- Total RAM: 8192 MB
- OS + kernel buffers: 512 MB
- Nginx: 128 MB
- Redis: 512 MB
- Safety buffer: 512 MB (you never want to hit swap)
- Available for PHP-FPM: 8192 - 512 - 128 - 512 - 512 = 6528 MB
- Average worker memory: 50 MB (measured from Step 1)
- pm.max_children: 6528 / 50 = 130
Round down, not up. I would set this to 120 to leave a comfortable margin. Hitting swap is catastrophic for PHP-FPM performance. When a PHP worker starts swapping, its response time goes from 200 ms to 5 seconds, and that worker is now holding a connection slot hostage for 25 times longer than it should. One swapping worker causes a cascading failure because it holds its slot while other requests queue up.
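The arithmetic is worth scripting so you can rerun it whenever your worker memory changes. A small sketch with the example numbers above hard-coded (substitute your own measurements):

```shell
total_ram_mb=8192
reserved_mb=$((512 + 128 + 512 + 512))  # OS/kernel + nginx + redis + safety buffer
worker_mb=50                            # measured in Step 1
available_mb=$((total_ram_mb - reserved_mb))
# Shell integer division rounds down, which is the direction you want
echo "pm.max_children = $((available_mb / worker_mb))"
```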
Here is a quick reference table for common configurations:
| Server RAM | Reserved for OS/Services | Worker Memory | pm.max_children |
|---|---|---|---|
| 2 GB | 768 MB | 40 MB | 30 |
| 4 GB | 1 GB | 50 MB | 60 |
| 8 GB | 2 GB | 50 MB | 120 |
| 16 GB | 3 GB | 60 MB | 210 |
| 32 GB | 4 GB | 80 MB | 350 |
These are starting points, not gospel. Measure, test, adjust. Every application is different.
Step 3: Choose Your PM Mode
PHP-FPM offers three process management modes. Each has a specific use case, and picking the wrong one costs you either performance or memory. Here is the honest comparison:
| Feature | static | dynamic | ondemand |
|---|---|---|---|
| How it works | All workers forked at startup, always running | Workers scale between min and max based on demand | Workers only created when requests arrive, killed after idle timeout |
| Memory usage | Constant (max_children x worker size) | Variable (fluctuates with traffic) | Minimal when idle, grows under load |
| First request latency | None (workers always ready) | None if spare servers available | Higher (must fork a new process) |
| Response time under load | Best (no forking overhead) | Good (occasional fork penalty) | Worst (frequent forking) |
| Memory efficiency | Worst (reserved even when idle) | Good (releases workers during quiet periods) | Best (zero memory when idle) |
| Best for | Dedicated web servers, high traffic, containers | Shared servers, variable traffic | Dev environments, very low traffic, cron-only pools |
My recommendation
If the server or container exists primarily to serve PHP, use pm = static. Period. You have already budgeted that memory for PHP-FPM whether you run static or dynamic; dynamic just adds the overhead of constantly forking and killing processes. With static, the workers stay warm, per-worker caches (such as the realpath cache) stay populated, and you get consistent performance. (OPcache itself lives in shared memory, so it survives either way; it is the per-worker state that static mode keeps hot.)
I only recommend pm = dynamic when the machine genuinely runs other services that need memory during off-peak hours. A server that runs PHP-FPM, MySQL, Redis, and Elasticsearch might legitimately benefit from dynamic mode because those services can use the freed memory during quiet periods.
I have never recommended pm = ondemand for a production web-facing pool. The fork-on-every-request overhead is measurable. It adds 5 to 20 ms of latency per request on a cold worker, which compounds fast under concurrent load. It is fine for admin panels, cron runners, or development.
Step 4: Configure Pool Settings
Once you have chosen your mode and calculated max_children, here is how to configure the full pool. I will show both static and dynamic since those are the production-relevant options.
Static mode (recommended for dedicated servers)
[www]
user = www-data
group = www-data
listen = /run/php/php-fpm.sock
listen.owner = www-data
listen.group = www-data
; Process management
pm = static
pm.max_children = 120
; Prevent memory leaks (critical for Magento/WordPress)
pm.max_requests = 500
; Status and monitoring
pm.status_path = /fpm-status
ping.path = /fpm-ping
ping.response = pong
; Slow log for debugging
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/slow.log
; Terminate stuck requests after 30 seconds
request_terminate_timeout = 30s
Dynamic mode (for shared servers)
[www]
user = www-data
group = www-data
listen = /run/php/php-fpm.sock
listen.owner = www-data
listen.group = www-data
; Process management
pm = dynamic
pm.max_children = 120
pm.start_servers = 30
pm.min_spare_servers = 20
pm.max_spare_servers = 40
; Prevent memory leaks
pm.max_requests = 500
; Status and monitoring
pm.status_path = /fpm-status
ping.path = /fpm-ping
ping.response = pong
; Slow log
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/slow.log
; Terminate stuck requests
request_terminate_timeout = 30s
A few notes on the dynamic mode values:
- pm.start_servers: How many workers to fork on startup. Set this to your average concurrency. If you typically see 30 concurrent PHP requests, start with 30.
- pm.min_spare_servers: PHP-FPM will kill workers until it gets down to this number of idle workers. Set this high enough that a small traffic burst does not require forking.
- pm.max_spare_servers: PHP-FPM will kill workers if idle count exceeds this. Keep it reasonable so you are not wasting memory during quiet periods.
A common rule of thumb: start_servers = (min_spare + max_spare) / 2. But honestly, just look at your traffic patterns and set start_servers to your typical concurrency.
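One way to express that in a script. The 2/3 and 4/3 multipliers are my own shorthand that happens to reproduce the example pool above (30/20/40), not an official formula:

```shell
avg_concurrency=30   # your typical number of simultaneous PHP requests
echo "pm.start_servers     = $avg_concurrency"
echo "pm.min_spare_servers = $((avg_concurrency * 2 / 3))"
echo "pm.max_spare_servers = $((avg_concurrency * 4 / 3))"
```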
Step 5: Enable pm.status_path for Monitoring
Flying blind is how you end up getting paged at 3 AM. Enable the status page and check it regularly. Add this to your PHP-FPM pool config (already included in the examples above):
pm.status_path = /fpm-status
Then expose it through nginx on a restricted endpoint:
# Nginx config - restrict to internal/monitoring IPs
location = /fpm-status {
    allow 127.0.0.1;
    allow 10.0.0.0/8;  # Internal network
    deny all;
    fastcgi_pass unix:/run/php/php-fpm.sock;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
}
Hit that endpoint and you will get output like this:
pool: www
process manager: static
start time: 12/Apr/2026:09:15:32 +0000
start since: 86423
accepted conn: 1284503
listen queue: 0
max listen queue: 12
listen queue len: 4096
idle processes: 87
active processes: 33
total processes: 120
max active processes: 118
max children reached: 0
slow requests: 47
The metrics that matter most:
- listen queue: If this is consistently above 0, you need more workers. Requests are waiting.
- max children reached: If this is non-zero, you have hit your ceiling. Either increase max_children or optimize your application.
- active processes vs total processes: This is your utilization ratio. If active is consistently close to total, you are running hot.
- slow requests: The number of requests that exceeded your request_slowlog_timeout. Investigate these.
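Those checks are easy to script against the status endpoint. A minimal sketch — it parses a captured sample here; in production, swap the variable for status=$(curl -s http://127.0.0.1/fpm-status):

```shell
# A captured sample stands in for: status=$(curl -s http://127.0.0.1/fpm-status)
status='listen queue: 3
max children reached: 1
active processes: 110
total processes: 120'

# Look up a metric by its exact key (avoids matching "max listen queue" etc.)
get() { printf '%s\n' "$status" | awk -F': ' -v k="$1" '$1 == k { print $2 }'; }

queue=$(get 'listen queue')
maxed=$(get 'max children reached')
[ "$queue" -gt 0 ] && echo "WARN: $queue requests waiting in the listen queue"
[ "$maxed" -gt 0 ] && echo "WARN: max_children has been hit $maxed time(s)"
```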
Step 6: Slow Log Configuration
The slow log is one of the most underrated debugging tools in the PHP ecosystem. When a request takes longer than the threshold, PHP-FPM dumps a full stack trace showing exactly where the code was spending time. This is invaluable for finding database queries that need indexing, external API calls that are timing out, or file operations that are blocking.
; Log any request taking longer than 5 seconds
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/slow.log
A typical slow log entry looks like this:
[12-Apr-2026 14:23:45] [pool www] pid 18432
script_filename = /var/www/html/pub/index.php
[0x00007f2a3c014320] execute() /var/www/html/vendor/magento/framework/DB/Adapter/Pdo/Mysql.php:94
[0x00007f2a3c014120] query() /var/www/html/vendor/magento/framework/DB/Statement/Pdo/Mysql.php:91
[0x00007f2a3c013f20] _execute() /var/www/html/vendor/magento/zendframework1/library/Zend/Db/Statement.php:303
That tells you exactly which database query was slow and the full call stack leading to it. Set the timeout based on your application's normal response time. If most requests complete in 200 ms, a 5-second threshold catches genuinely problematic requests without flooding your log with noise.
Important: Make sure the log directory exists and is writable by the PHP-FPM user. Also set up log rotation or you will end up with a multi-gigabyte log file that fills your disk.
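Since the slow log can grow quickly on a busy site, pair it with rotation from day one. A minimal logrotate sketch — the path matches the pool config above; the weekly/8-rotation retention is just my assumption, adjust to taste:

```
# /etc/logrotate.d/php-fpm-slow
/var/log/php-fpm/slow.log {
    weekly
    rotate 8
    compress
    missingok
    notifempty
}
```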
Pro Tip: pm.max_requests Prevents Silent Memory Leaks
This is the setting that saves you from yourself and from every third-party library you depend on. pm.max_requests tells PHP-FPM to gracefully kill and respawn a worker after it has handled N requests. This prevents memory leaks from slowly consuming all your server memory.
pm.max_requests = 500
Why 500? It is a balance. Too low (like 100) and you spend too much time forking new workers and warming opcode caches. Too high (like 10000) and a worker with a slow memory leak can grow from 50 MB to 300 MB before it gets recycled. I have found 500 to be the sweet spot for most applications.
For Magento 2, this is absolutely critical. Magento's dependency injection container, event observer system, and plugin interceptors all accumulate objects in memory. I have watched Magento workers grow from 80 MB to over 400 MB after handling 2000 requests without recycling. That is the kind of thing that turns a healthy server into a swapping mess at 2 PM on a Tuesday.
For WordPress with a lot of plugins, the same principle applies. Some plugins allocate memory in hooks and never free it. A max_requests of 500 keeps workers lean.
Setting pm.max_requests = 0 (unlimited) is never a good idea in production. I do not care how clean you think your code is. Dependencies leak memory. Extensions leak memory. Even PHP itself can have subtle leaks in certain configurations. Always set a limit.
Kubernetes and Docker PHP-FPM Specifics
If you are running PHP-FPM in containers, the rules change slightly. The big gotcha is that PHP-FPM does not automatically know about container memory limits. It sees the host machine's total memory, not the cgroup limit.
Container memory limits vs pm.max_children
Say your Kubernetes pod has a memory limit of 512 Mi and your PHP workers use 50 MB each. You might think you can run 10 workers, but you also need memory for the master process, the opcode cache, and the PHP runtime itself. In practice:
# Container memory limit: 512Mi
# PHP-FPM master process: ~30 MB
# OPcache shared memory: 128 MB (php.ini opcache.memory_consumption)
# Available for workers: 512 - 30 - 128 = 354 MB
# Per worker: 50 MB
# pm.max_children: 354 / 50 = 7 (round down to 6 for safety)
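The same budget math, scripted with the pod numbers above hard-coded. On cgroup v2 you can read the real limit from /sys/fs/cgroup/memory.max (in bytes) instead of hard-coding it:

```shell
limit_mb=512      # container memory limit (memory.max / 1024 / 1024)
master_mb=30      # PHP-FPM master process, rough estimate
opcache_mb=128    # opcache.memory_consumption
worker_mb=50      # measured per-worker RSS
echo "pm.max_children = $(( (limit_mb - master_mb - opcache_mb) / worker_mb ))"
```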
If you set max_children too high for your container limit, the OOM killer will terminate PHP-FPM with no warning. Your pod restarts, active requests get dropped, and you get a 502 error cascade. I have seen this happen in production more times than I want to admit.
Liveness and readiness probes
Use the ping.path directive as your liveness probe and the status page for readiness:
# In your PHP-FPM pool config
ping.path = /fpm-ping
ping.response = pong
pm.status_path = /fpm-status
# Kubernetes deployment spec
livenessProbe:
  tcpSocket:
    port: 9000
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  exec:
    command:
      - php-fpm-healthcheck # or a custom script checking /fpm-ping
  initialDelaySeconds: 5
  periodSeconds: 5
Use static mode in containers
In a container, you have a fixed memory budget. There is no reason to use dynamic mode because there are no other services competing for memory inside the container. Use pm = static and set max_children to fill your memory budget. The container orchestrator handles scaling by adding more pods, not by growing workers within a pod.
Monitoring with Grafana and Prometheus
For production monitoring at scale, the FPM status page is a good start but you want proper time-series data. The php-fpm_exporter from hipages scrapes the status page and exposes metrics in Prometheus format.
Key metrics to alert on:
- phpfpm_listen_queue > 0 for more than 30 seconds: Workers are saturated, requests are queuing
- phpfpm_max_children_reached incrementing: You have hit your worker ceiling
- phpfpm_active_processes / phpfpm_total_processes > 0.85: Running above 85% utilization, scale soon
- phpfpm_slow_requests rate > 1/min: Application performance is degrading
Set up a Grafana dashboard with these four panels and you will catch PHP-FPM issues before they become user-facing problems. I have a standing rule: if active processes exceed 80% of total processes for more than 5 minutes, we either scale horizontally (add pods) or vertically (increase memory and max_children).
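Those alert thresholds translate directly into Prometheus alerting rules. A sketch using the metric names listed above (as exposed by the hipages exporter); thresholds and labels are the ones from my standing rule, tune them for your environment:

```yaml
groups:
  - name: php-fpm
    rules:
      - alert: PhpFpmListenQueueBacklog
        expr: phpfpm_listen_queue > 0
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: "PHP-FPM requests are queuing on {{ $labels.instance }}"
      - alert: PhpFpmHighUtilization
        expr: phpfpm_active_processes / phpfpm_total_processes > 0.85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "PHP-FPM worker utilization above 85% on {{ $labels.instance }}"
```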
Real World Example: From 502 Errors to 1000 req/s
Let me walk you through an actual production tuning session. The environment was a Magento 2 store on an 8 GB dedicated server behind Varnish. During a promotional event, the site was throwing 502 errors at roughly 200 concurrent users.
Before tuning:
pm = dynamic
pm.max_children = 10
pm.start_servers = 4
pm.min_spare_servers = 2
pm.max_spare_servers = 6
pm.max_requests = 0 ; unlimited - bad
Worker memory had ballooned to 180 MB each because max_requests was unlimited. With 10 workers at 180 MB, PHP-FPM was using 1.8 GB and was completely saturated. Every additional request queued up until nginx timed out and returned a 502.
After tuning:
pm = static
pm.max_children = 80
pm.max_requests = 500
request_slowlog_timeout = 5s
request_terminate_timeout = 30s
We also set opcache.memory_consumption = 256 and opcache.max_accelerated_files = 20000 because Magento has thousands of PHP files.
Results:
- Worker memory dropped from 180 MB to 65 MB (recycling via max_requests)
- Total PHP-FPM memory: 80 x 65 MB = 5.2 GB (within our 6 GB budget)
- Listen queue: consistently 0 (was previously hitting 50+)
- 502 errors: zero
- Sustained throughput: 1000+ requests per second (Varnish hit rate was about 85%, so roughly 150 req/s actually reached PHP-FPM)
- Average response time for PHP requests: 180 ms (down from 2+ seconds)
The total tuning time was about 30 minutes. The impact was enormous. Most of the improvement came from simply giving PHP-FPM enough workers to handle the load without queuing.
Six Common Mistakes to Avoid
- Never measuring per-worker memory. Guessing leads to either OOM kills (too many workers) or wasted resources (too few). Always measure with ps under real load before setting max_children.
- Setting pm.max_requests = 0. This is the default on many distributions, and it means workers never recycle. Memory leaks accumulate indefinitely. Always set a value between 200 and 1000.
- Using ondemand mode for production traffic. The fork-per-request overhead is real. Use static or dynamic for any pool that handles web traffic. Save ondemand for cron pools or admin panels with infrequent access.
- Forgetting to account for OPcache shared memory. OPcache allocates a shared memory segment (default 128 MB) that does not show up in per-worker measurements but absolutely counts against your memory budget. Check your opcache.memory_consumption setting.
- Not setting request_terminate_timeout. Without this, a stuck PHP request (infinite loop, deadlocked database query) holds a worker hostage forever. Set it to something reasonable like 30 or 60 seconds. This is your safety net.
- Ignoring the listen queue. The listen queue is your most important metric. A queue length above 0 means requests are waiting for a worker. Sustained queuing means you need more workers, a faster application, or both. Monitor this metric and alert on it.
Frequently Asked Questions
What is the best pm.max_children value for my server?
There is no universal number. The formula is: (Total server RAM - RAM reserved for OS and other services) / Average PHP worker memory usage. For example, on an 8 GB server where each PHP worker uses 50 MB and you reserve 2 GB for the OS, the calculation is (8192 - 2048) / 50 = 122 max_children. Always measure your actual per-worker memory with ps aux or memory_get_peak_usage() rather than guessing. Start conservative, monitor for a few days, and adjust. If you never see the listen queue above zero and active processes stay below 70% of total, you probably have headroom to spare.
Should I use pm static or pm dynamic for a high traffic site?
For dedicated web servers handling consistent high traffic, pm = static is almost always the better choice. It pre-forks all workers at startup so there is zero overhead when requests arrive. Use pm = dynamic only on shared servers where you need to free memory for other services during quiet periods. The performance difference between static and dynamic is small during steady traffic, but during traffic spikes, static mode handles the burst instantly while dynamic mode spends precious milliseconds forking new workers. In a Kubernetes environment, always use static because the container has a fixed memory budget and the orchestrator handles scaling.
How do I monitor PHP-FPM worker utilization in production?
Enable pm.status_path in your PHP-FPM pool config, then expose it through your web server on a restricted endpoint. The status page shows active processes, idle workers, the listen queue length, and max children reached count. For Grafana and Prometheus setups, use the php-fpm_exporter to scrape these metrics and build dashboards. The most critical metric to watch is the listen queue: if it is consistently above zero, you need more workers. Set up alerts for listen_queue > 0 sustained over 30 seconds and for active_processes/total_processes exceeding 85%.
Check Your Server's Exposure
A well-tuned PHP-FPM pool does not help if your server is leaking sensitive configuration files. Run a free exposure scan to check for .env files, open ports, and misconfigured headers.
The Bottom Line
PHP-FPM tuning is not complicated, but it requires measurement and intentionality. The defaults are wrong for production. Measure your per-worker memory, calculate max_children based on your actual resources, choose the right PM mode for your workload, set max_requests to prevent memory leaks, and monitor your listen queue religiously.
I have tuned PHP-FPM pools on bare metal, VMs, and Kubernetes clusters. The principles are always the same. Give PHP-FPM enough workers to handle your concurrency, do not let workers leak memory, and watch the metrics that tell you when you are running out of headroom.
If you are seeing 502 errors or memory exhaustion under load, the fix is probably in your PHP-FPM config. Start with the formula. Measure, calculate, deploy, monitor. That is the whole process.
Usman has 10+ years of experience securing enterprise infrastructure, managing high-traffic servers, and building zero-knowledge security tools. Read more about the author.