Fix Docker Container OOM Killed: Memory Limits and Debugging Guide
Quick Fix
Container getting OOM killed? Check the exit code and increase the memory limit:
# Confirm OOM kill
docker inspect <container> | grep -A 5 '"State"'
# Look for "OOMKilled": true, "ExitCode": 137
# Run with explicit memory limit
docker run -m 1g --memory-swap 1g your-image
# Or in docker-compose.yml
# deploy.resources.limits.memory: 1g
Your container exited with code 137. docker inspect shows "OOMKilled": true. The Linux kernel's OOM killer terminated your process because it exceeded its memory cgroup limit. This guide covers how to identify OOM kills, set correct memory limits, debug memory leaks in Java, Node.js, and PHP containers, configure swap, and handle cgroup v2 differences.
Confirming an OOM Kill
Exit code 137 means the process received SIGKILL (128 + 9). This is the OOM killer's signature, but SIGKILL can also come from docker kill or a manual kill -9. To confirm the kernel OOM killer was responsible, use one of these four methods:
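The 128 + N convention is easy to verify from a shell. A small sketch (the helper name is mine, not a Docker command) that decodes an exit status into the signal that caused it:

```shell
# Decode a container exit code into the signal that killed the process.
# Codes above 128 mean "terminated by signal (code - 128)".
signal_from_exit() {
  code=$1
  if [ "$code" -gt 128 ]; then
    kill -l $(( code - 128 ))   # prints the signal name
  else
    echo "not signal-terminated (exit $code)"
  fi
}

signal_from_exit 137   # KILL -> SIGKILL, the OOM killer's signature
signal_from_exit 139   # SEGV -> a segfault, a different problem entirely
```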
Method 1: docker inspect
The most reliable method. Docker records OOM events in the container state:
docker inspect --format='{{json .State}}' <container> | python3 -m json.tool
Look for these fields:
{
"Status": "exited",
"Running": false,
"OOMKilled": true,
"ExitCode": 137,
"Error": "",
"FinishedAt": "2026-04-02T14:23:18.442Z"
}
"OOMKilled": true is definitive. If it shows false but exit code is 137, the kill came from outside the container (another process sent SIGKILL).
Method 2: dmesg (Kernel Log)
The kernel logs OOM events with full detail including the process that was killed and the cgroup that triggered the OOM:
dmesg | grep -i "oom\|killed process" | tail -20
Typical output:
[482936.204] memory cgroup out of memory: Killed process 28451 (java)
total-vm:4892176kB, anon-rss:524288kB, file-rss:45216kB
oom_score_adj: 0
Memory cgroup stats for /docker/a1b2c3d4e5f6:
anon 536870912, file 46301184, kernel 8388608
hierarchical_memory_limit 536870912
The hierarchical_memory_limit line shows the cgroup memory ceiling. The anon value shows actual anonymous memory usage. When anon hits the limit, the OOM killer fires.
Method 3: journalctl
On systemd hosts, OOM events are also captured in the journal:
# All OOM events in the last hour
journalctl -k --since "1 hour ago" | grep -i oom
# OOM events for a specific container
journalctl -k | grep "Memory cgroup" | grep <container-id>
Method 4: Docker Events
Docker emits oom events that you can watch in real time or query historically:
# Watch for OOM events in real time
docker events --filter event=oom
# Check recent events
docker events --since 1h --filter event=oom
Setting Memory Limits Correctly
Docker provides three memory-related flags. Understanding all three is critical to avoiding false OOM kills.
The -m / --memory Flag
Sets the hard memory limit. When the container exceeds this, the OOM killer fires.
# Hard limit of 512MB
docker run -m 512m your-image
# Same thing, long form
docker run --memory=512m your-image
The --memory-swap Flag
This flag is widely misunderstood. It sets the total memory + swap limit, not the swap amount:
- -m 512m --memory-swap 1g = 512MB RAM + 512MB swap (total 1GB)
- -m 512m --memory-swap 512m = 512MB RAM + 0 swap (swap disabled)
- -m 512m --memory-swap -1 = 512MB RAM + unlimited swap
- -m 512m (no --memory-swap) = 512MB RAM + 512MB swap (swap equals memory by default)
For production containers, explicitly disable swap to get predictable OOM behavior instead of degraded performance from swap thrashing:
docker run -m 1g --memory-swap 1g your-image
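Because the total-not-swap semantics are easy to misread, here is a small sketch that computes the implied swap allowance for a given -m / --memory-swap pair (the size parser is deliberately simplified to m/g suffixes):

```shell
# Convert 512m / 1g style sizes to megabytes (simplified parser).
to_mb() {
  case $1 in
    *g) echo $(( ${1%g} * 1024 )) ;;
    *m) echo "${1%m}" ;;
    *)  echo "$1" ;;            # assume a raw MB value
  esac
}

# swap = (--memory-swap total) - (-m), per Docker's semantics.
swap_mb() {
  mem=$(to_mb "$1"); memswap=$2
  if [ "$memswap" = "-1" ]; then
    echo unlimited
  else
    echo $(( $(to_mb "$memswap") - mem ))
  fi
}

swap_mb 512m 1g     # 512  -> 512MB RAM + 512MB swap
swap_mb 512m 512m   # 0    -> swap disabled
swap_mb 512m -1     # unlimited
```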
The --memory-reservation Flag
Sets a soft limit. Docker will attempt to reclaim memory from the container when the host is under memory pressure, but the container can exceed this limit. This is not a guarantee:
docker run -m 1g --memory-reservation 768m your-image
Useful for co-located containers where you want to signal priority. The container with the lower reservation gets reclaimed first.
Docker Compose Memory Limits
In Compose v3 with swarm mode, memory limits go under deploy.resources:
services:
app:
image: your-image
deploy:
resources:
limits:
memory: 1g
reservations:
memory: 512m
Important: The legacy docker-compose v1 binary ignores the deploy key unless you pass --compatibility or deploy to Docker Swarm; the current docker compose plugin does apply deploy.resources limits. If you need to support older tooling, use the Compose file v2 syntax:
services:
app:
image: your-image
mem_limit: 1g
mem_reservation: 512m
memswap_limit: 1g
Finding Memory Leaks in Running Containers
Before increasing memory limits, determine whether your container has a legitimate memory need or a leak. Throwing more memory at a leak just delays the OOM kill.
docker stats
Real-time memory monitoring for running containers:
# All containers
docker stats
# Specific container, no-stream for a single snapshot
docker stats --no-stream <container>
# Custom format showing memory usage percentage
docker stats --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"
Watch the MEM USAGE column over time. A steady climb that never stabilizes indicates a leak. A sawtooth pattern (rise then drop) indicates normal GC behavior.
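That eyeball test can be automated: collect periodic samples and flag a series that only ever climbs. A sketch with a hypothetical helper (the sampling command is shown in a comment):

```shell
# Flag a leak-like pattern: every sample strictly higher than the last.
# Samples are MB values, e.g. collected once a minute with:
#   docker stats --no-stream --format '{{.MemUsage}}' <container>
looks_like_leak() {
  prev=-1
  for v in "$@"; do
    [ "$v" -le "$prev" ] && { echo "no (saw a drop: GC or stable)"; return; }
    prev=$v
  done
  echo "yes (monotonic climb)"
}

looks_like_leak 120 145 170 210 260   # yes (monotonic climb)
looks_like_leak 120 180 130 190 140   # no  (sawtooth: normal GC)
```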
Reading cgroup Memory Files Directly
For more granular data than docker stats provides, read the cgroup files directly:
# cgroup v2 (modern kernels, Docker 20.10+) with the default systemd cgroup driver
CGROUP_ID=$(docker inspect --format='{{.Id}}' <container>)
CGROUP_PATH=/sys/fs/cgroup/system.slice/docker-${CGROUP_ID}.scope
cat ${CGROUP_PATH}/memory.current # Current usage in bytes
cat ${CGROUP_PATH}/memory.max # Limit in bytes
cat ${CGROUP_PATH}/memory.stat # Detailed breakdown
# cgroup v1 (older kernels)
CGROUP_ID=$(docker inspect --format='{{.Id}}' <container>)
cat /sys/fs/cgroup/memory/docker/${CGROUP_ID}/memory.usage_in_bytes
cat /sys/fs/cgroup/memory/docker/${CGROUP_ID}/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/docker/${CGROUP_ID}/memory.stat
The memory.stat file breaks down usage into anonymous pages (heap), file-backed pages (cache), kernel memory, and more. For leak investigation, focus on anon (cgroup v2) or rss (cgroup v1) growing unbounded.
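As a sketch, pulling the anon figure out of a memory.stat dump (demonstrated against an illustrative saved sample, so the byte values here are made up):

```shell
# Extract anonymous (heap) memory in MB from a cgroup v2 memory.stat file.
anon_mb() {
  awk '/^anon / { printf "%d\n", $2 / 1048576 }' "$1"
}

# Illustrative memory.stat content:
cat > /tmp/memory.stat.sample <<'EOF'
anon 536870912
file 46301184
kernel_stack 1048576
slab 8388608
EOF

anon_mb /tmp/memory.stat.sample   # 512
```

Graph this value over time; unbounded growth in anon (not file, which is reclaimable cache) is the leak signal.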
Process-Level Memory Inside the Container
# Exec into the container
docker exec -it <container> /bin/sh
# Check process memory
cat /proc/1/status | grep -i vm
# VmRSS = resident set size (physical memory)
# VmSize = virtual memory (includes mapped but unused pages)
# Or use top/htop if available
top -bn1 | head -20
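To compare those kB figures against your -m limit, a short parser sketch (demonstrated on a saved sample of /proc/<pid>/status; it works the same against the live file):

```shell
# Report VmRSS (resident set size) in MB from a /proc/<pid>/status file.
rss_mb() {
  awk '/^VmRSS:/ { printf "%d\n", $2 / 1024 }' "$1"
}

# Works directly against a live process:  rss_mb /proc/1/status
# Illustrated with a saved sample:
cat > /tmp/status.sample <<'EOF'
Name:   java
VmSize:  4892176 kB
VmRSS:    524288 kB
EOF

rss_mb /tmp/status.sample   # 512
```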
Java Container OOM Patterns
Java is the most common source of container OOM kills because the JVM's memory model is more complex than a simple heap allocation.
The -Xmx Trap
Setting -Xmx equal to the container memory limit guarantees an OOM kill. The JVM uses significant memory outside the heap:
- Metaspace: Class metadata, typically 50-200MB for large applications
- Thread stacks: Default 1MB per thread. 200 threads = 200MB
- Code cache: JIT compiled code, typically 48-240MB
- Direct byte buffers: NIO buffers allocated outside the heap
- Native memory: JNI allocations, compressed class pointers, GC data structures
Rule of thumb: set -Xmx to 75% of the container memory limit. For a 1GB container:
docker run -m 1g your-java-app \
java -Xmx768m -Xms768m \
-XX:MaxMetaspaceSize=128m \
-XX:ReservedCodeCacheSize=64m \
-jar app.jar
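The 75% rule is plain arithmetic; this sketch derives the -Xmx megabyte value from a container limit in MB, leaving the remaining quarter for metaspace, thread stacks, code cache, and native allocations:

```shell
# Heap = 75% of the container limit (integer MB).
heap_for_limit_mb() {
  echo $(( $1 * 75 / 100 ))
}

heap_for_limit_mb 1024   # 768  -> -Xmx768m for a 1GB container
heap_for_limit_mb 2048   # 1536 -> -Xmx1536m for a 2GB container
```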
UseContainerSupport (JDK 10+)
Modern JVMs (JDK 10+) detect container memory limits and set heap size automatically:
# JDK 10+: enabled by default
java -XX:+UseContainerSupport \
-XX:MaxRAMPercentage=75.0 \
-jar app.jar
-XX:MaxRAMPercentage=75.0 tells the JVM to use 75% of the detected container memory for the heap. This is container-aware and more reliable than hardcoding -Xmx.
Gotcha: On JDK 8u131-191, the flag is -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap, but it only reads cgroup v1 files. If your host uses cgroup v2, the JVM sees the full host memory instead of the container limit. Cgroup v2 detection landed in JDK 15 and was backported to JDK 11.0.16+ and 8u372+, so upgrade at least that far on cgroup v2 hosts.
Native Memory Tracking
Enable NMT to see exactly where JVM memory is going:
# Enable NMT with summary level
docker run -m 1g your-java-app \
java -XX:NativeMemoryTracking=summary -jar app.jar
# Then exec into the container and dump the report
docker exec <container> jcmd 1 VM.native_memory summary
Sample output showing memory breakdown:
Total: reserved=1842MB, committed=1118MB
- Java Heap (reserved=768MB, committed=768MB)
- Class (reserved=136MB, committed=98MB)
- Thread (reserved=142MB, committed=142MB)
- Code (reserved=64MB, committed=52MB)
- GC (reserved=38MB, committed=38MB)
- Internal (reserved=12MB, committed=12MB)
- Symbol (reserved=8MB, committed=8MB)
Node.js Container OOM Patterns
Node.js uses V8's heap with a default max size of approximately 1.5GB on 64-bit systems (varies by V8 version). In a container with a 512MB limit, Node.js will attempt to allocate up to 1.5GB and get OOM killed long before V8 triggers its own GC pressure.
Set --max-old-space-size
# Set V8 heap to 384MB in a 512MB container
docker run -m 512m your-node-app \
node --max-old-space-size=384 server.js
In your Dockerfile:
ENV NODE_OPTIONS="--max-old-space-size=384"
CMD ["node", "server.js"]
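When the same image runs under different limits, the heap size can be derived at container start instead of hardcoded. A hedged entrypoint sketch (the 75% ratio and the 384MB fallback are my assumptions; /sys/fs/cgroup/memory.max is the cgroup v2 path):

```shell
#!/bin/sh
# entrypoint.sh (sketch): size the V8 heap from the cgroup v2 limit.
heap_mb_from_limit() {
  limit=$1
  if [ "$limit" = "max" ] || [ -z "$limit" ]; then
    echo 384                  # fallback when the limit is unset or unlimited
  else
    echo $(( limit / 1048576 * 75 / 100 ))
  fi
}

heap_mb_from_limit 536870912   # 384 for a 512MB limit
heap_mb_from_limit max         # 384 (fallback)

# In the real entrypoint, read the limit and exec node:
#   LIMIT=$(cat /sys/fs/cgroup/memory.max 2>/dev/null || echo max)
#   exec node --max-old-space-size="$(heap_mb_from_limit "$LIMIT")" server.js
```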
Detecting Node.js Memory Leaks
Use the built-in --inspect flag and Chrome DevTools for heap snapshots:
# Run with inspector
docker run -m 512m -p 9229:9229 your-node-app \
node --inspect=0.0.0.0:9229 --max-old-space-size=384 server.js
Then open chrome://inspect in Chrome, connect to the container, and take heap snapshots. Compare snapshots taken minutes apart. Objects that grow between snapshots without corresponding request load are leaks.
Common Node.js leak patterns:
- Event listener accumulation: calling .on() without .off() in request handlers
- Closure captures: closures referencing large objects that outlive the request
- Global caches without eviction: in-memory caches that grow without TTL or LRU
- Unresolved promises: promises that never resolve or reject hold their closure scope
PHP Container OOM Patterns
PHP processes typically have shorter lifecycles than Java or Node.js, but PHP-FPM worker pools can consume significant aggregate memory.
PHP memory_limit vs Container Limit
PHP's memory_limit controls per-script memory. But a PHP-FPM pool with 20 workers, each using 128MB, needs 2.5GB for workers alone plus memory for the FPM master process and the OS.
# Calculate required container memory
# Workers * memory_limit + FPM master (50MB) + OS overhead (100MB)
# 10 workers * 128MB + 50MB + 100MB = 1.43GB
docker run -m 1536m your-php-app
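The same sizing arithmetic as a reusable sketch (the 50MB FPM master and 100MB OS overhead figures mirror the rough estimates in the comment above):

```shell
# Container memory needed for a PHP-FPM pool, in MB:
# workers * memory_limit + FPM master (~50MB) + OS overhead (~100MB)
fpm_container_mb() {
  workers=$1; per_worker_mb=$2
  echo $(( workers * per_worker_mb + 50 + 100 ))
}

fpm_container_mb 10 128   # 1430 -> round up, e.g. -m 1536m
fpm_container_mb 20 128   # 2710
```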
PHP-FPM Tuning for Containers
In php-fpm.conf or the pool config (www.conf):
; Use static pool sizing in containers (not dynamic)
pm = static
pm.max_children = 10
; Kill workers that leak memory
pm.max_requests = 500
pm = static is preferred in containers because dynamic scaling adds unpredictable memory spikes. pm.max_requests recycles workers after N requests, which mitigates PHP extensions that leak native memory (common with older MySQL or image processing extensions).
Swap Configuration
Swap in containers is a tradeoff: it prevents OOM kills but causes severe performance degradation when the container starts swapping. For most production workloads, disabling swap and sizing the memory limit correctly is the better approach.
Disable Swap Entirely
# Set --memory-swap equal to --memory to disable swap
docker run -m 1g --memory-swap 1g your-image
Allow Limited Swap
# 1GB RAM + 512MB swap = 1.5GB total
docker run -m 1g --memory-swap 1536m your-image
Control Swappiness
# Reduce swap preference (0-100, lower = less swapping)
docker run -m 1g --memory-swappiness=10 your-image
Note: --memory-swappiness requires swap to be enabled on the host. If the host has swap disabled (swapoff -a), this flag has no effect.
Cgroup v2 Gotchas
Docker 20.10+ supports cgroup v2 (unified hierarchy). Most current Linux distributions (Ubuntu 22.04+, Fedora 31+, Debian 11+) default to cgroup v2. This causes several behavior changes that break assumptions from the cgroup v1 era.
Check Which Cgroup Version You Are Running
# If this file exists, you are on cgroup v2
[ -f /sys/fs/cgroup/cgroup.controllers ] && echo "cgroup v2" || echo "cgroup v1"
# Or check the filesystem type
mount | grep cgroup
# cgroup v2: cgroup2 on /sys/fs/cgroup type cgroup2
# cgroup v1: cgroup on /sys/fs/cgroup/memory type cgroup
Key Differences
- No --kernel-memory flag: Cgroup v2 removed kernel memory limits. Docker silently ignores --kernel-memory on v2. If your Compose file sets kernel_memory, it will be ignored without warning.
- memory.high (soft limit): Cgroup v2 introduces memory.high as a throttling threshold. When usage exceeds memory.high, the kernel applies reclaim pressure (slowing allocations) before reaching memory.max (hard limit). Docker exposes this via --memory-reservation.
- Separate swap accounting: In v1, memory.memsw.limit_in_bytes was memory + swap combined. In v2, memory.swap.max controls swap independently. Docker translates --memory-swap correctly, but if you read cgroup files directly, the semantics differ.
- OOM kill behavior: Cgroup v2 can SIGKILL the entire cgroup at once when memory.oom.group is set, rather than only the single process that triggered the OOM as in v1. This matters for containers running multiple processes (e.g., supervisord).
- JDK 8 does not read cgroup v2: As mentioned above, JDK 8 reads v1 files only. On a cgroup v2 host, JDK 8 sees the full host memory and may set its heap too high, causing OOM kills.
Force Cgroup v1 (Workaround)
If you must run legacy applications that do not support cgroup v2, you can force v1 via a kernel boot parameter:
# Add to GRUB_CMDLINE_LINUX in /etc/default/grub
GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0"
# Then update GRUB and reboot
sudo update-grub
sudo reboot
This is a temporary workaround. The correct fix is to update your runtime (JDK 11+, Node 12.17+, etc.) to a version that reads cgroup v2 files.
Preventing OOM Kills in Production
A checklist for container memory configuration in production environments:
- Always set explicit memory limits. A container without -m can use all host memory, potentially OOM killing other containers or the host itself.
- Profile memory under realistic load. Run load tests and observe peak memory via docker stats or Prometheus. Set the limit to 125-150% of observed peak.
- Set runtime-specific heap limits. Java: MaxRAMPercentage=75. Node.js: --max-old-space-size at 75% of the container limit. PHP: memory_limit * workers + overhead.
- Disable swap in production. Set --memory-swap equal to --memory. Swap masks memory issues and causes unpredictable latency.
- Monitor memory trends. A container that slowly climbs from 40% to 95% over days has a leak. Fix the leak, do not increase the limit.
- Set up OOM alerts. Use Docker events, Prometheus alerts, or Datadog monitors to alert on container OOM kills immediately.
- Use --oom-kill-disable with extreme caution. This flag prevents the OOM killer from killing the container, but the host may kill other processes instead. Only use it for critical single-container hosts.
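The sizing guidance above (125-150% of observed peak) reduces to arithmetic; this sketch uses the top of that range:

```shell
# Limit = observed peak plus 50% headroom (the top of the 125-150% range).
limit_from_peak_mb() {
  echo $(( $1 * 150 / 100 ))
}

limit_from_peak_mb 800   # 1200 -> -m 1200m for an 800MB observed peak
```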
The Bottom Line
Exit code 137 with "OOMKilled": true means your container exceeded its memory cgroup limit. The fix is not always "give it more memory." First, profile the actual memory usage. Second, check that your runtime (JVM, V8, PHP-FPM) is configured to respect container limits. Third, investigate leaks by comparing memory over time. Only after confirming the workload genuinely needs more memory should you increase the limit. And always set explicit limits, because a container without memory limits is a host OOM event waiting to happen.
Related Articles
Continue reading: Fix Kubernetes CrashLoopBackOff, Fix AWS EFS Permission Denied, Kubernetes Secrets Management, Fix Let's Encrypt Renewal Failed, How to Secure API Keys in Code.
Usman has 10+ years of experience securing enterprise infrastructure, managing high-traffic servers, and building zero-knowledge security tools. Read more about the author.