
Fix Docker Container OOM Killed: Memory Limits and Debugging Guide

Quick Fix

Container getting OOM killed? Check the exit code and increase the memory limit:

# Confirm OOM kill
docker inspect <container> | grep -A 5 '"State"'
# Look for "OOMKilled": true, "ExitCode": 137

# Run with explicit memory limit
docker run -m 1g --memory-swap 1g your-image

# Or in docker-compose.yml
# deploy.resources.limits.memory: 1g

Your container exited with code 137. docker inspect shows "OOMKilled": true. The Linux kernel's OOM killer terminated your process because it exceeded its memory cgroup limit. This guide covers how to identify OOM kills, set correct memory limits, debug memory leaks in Java, Node.js, and PHP containers, configure swap, and handle cgroup v2 differences.

Confirming an OOM Kill

Exit code 137 means the process received SIGKILL (128 + 9). This is the OOM killer's signature, but SIGKILL can also come from docker kill or manual kill -9. To confirm the kernel OOM killer was responsible, use these three methods:
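The 128 + signal convention can be sketched as a tiny shell helper (hypothetical, for illustration) that decodes any container exit code:

```shell
# Hypothetical helper: exit codes above 128 mean "killed by signal N".
decode_exit() {
  if [ "$1" -gt 128 ]; then
    echo "killed by signal $(( $1 - 128 ))"
  else
    echo "normal exit with status $1"
  fi
}

decode_exit 137   # killed by signal 9, i.e. SIGKILL
decode_exit 143   # killed by signal 15, i.e. SIGTERM
```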

Method 1: docker inspect

The most reliable method. Docker records OOM events in the container state:

docker inspect <container> --format='{{json .State}}' | python3 -m json.tool

Look for these fields:

{
    "Status": "exited",
    "Running": false,
    "OOMKilled": true,
    "ExitCode": 137,
    "Error": "",
    "FinishedAt": "2026-04-02T14:23:18.442Z"
}

"OOMKilled": true is definitive. If it shows false but exit code is 137, the kill came from outside the container (another process sent SIGKILL).

Method 2: dmesg (Kernel Log)

The kernel logs OOM events with full detail including the process that was killed and the cgroup that triggered the OOM:

dmesg | grep -i "oom\|killed process" | tail -20

Typical output:

[482936.204] memory cgroup out of memory: Killed process 28451 (java)
  total-vm:4892176kB, anon-rss:524288kB, file-rss:45216kB
  oom_score_adj: 0
  Memory cgroup stats for /docker/a1b2c3d4e5f6:
  anon 536870912, file 46301184, kernel 8388608
  hierarchical_memory_limit 536870912

The hierarchical_memory_limit line shows the cgroup memory ceiling. The anon value shows actual anonymous memory usage. When anon hits the limit, the OOM killer fires.
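Those dmesg counters are in bytes; converting to MiB confirms they line up with the -m flag (536870912 bytes is exactly the limit from -m 512m):

```shell
# Convert a byte count from dmesg/cgroup files to MiB for comparison
# against the docker run -m value.
to_mib() { echo $(( $1 / 1024 / 1024 )); }

to_mib 536870912   # 512
```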

Method 3: journalctl

On systemd hosts, OOM events are also captured in the journal:

# All OOM events in the last hour
journalctl -k --since "1 hour ago" | grep -i oom

# OOM events for a specific container
journalctl -k | grep "Memory cgroup" | grep <container-id>

Method 4: Docker Events

Docker emits oom events that you can watch in real time or query historically:

# Watch for OOM events in real time
docker events --filter event=oom

# Check recent events
docker events --since 1h --filter event=oom

Setting Memory Limits Correctly

Docker provides three memory-related flags. Understanding all three is critical to avoiding false OOM kills.

The -m / --memory Flag

Sets the hard memory limit. When the container exceeds this, the OOM killer fires.

# Hard limit of 512MB
docker run -m 512m your-image

# Same thing, long form
docker run --memory=512m your-image

The --memory-swap Flag

This flag is widely misunderstood. It sets the total memory + swap limit, not the swap amount:

  • -m 512m --memory-swap 1g = 512MB RAM + 512MB swap (total 1GB)
  • -m 512m --memory-swap 512m = 512MB RAM + 0 swap (swap disabled)
  • -m 512m --memory-swap -1 = 512MB RAM + unlimited swap
  • -m 512m (no --memory-swap) = 512MB RAM + 512MB swap (swap equals memory by default)
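The bullet rules reduce to simple subtraction: actual swap is --memory-swap minus --memory. A hypothetical helper (values in MB) makes that explicit:

```shell
# Swap actually available to the container, given -m and --memory-swap
# values in MB; -1 means unlimited swap.
swap_mb() {
  if [ "$2" = "-1" ]; then
    echo "unlimited"
  else
    echo $(( $2 - $1 ))
  fi
}

swap_mb 512 1024   # 512 (MB of swap)
swap_mb 512 512    # 0, swap disabled
swap_mb 512 -1     # unlimited
```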

For production containers, explicitly disable swap to get predictable OOM behavior instead of degraded performance from swap thrashing:

docker run -m 1g --memory-swap 1g your-image

The --memory-reservation Flag

Sets a soft limit. Docker will attempt to reclaim memory from the container when the host is under memory pressure, but the container can exceed this limit. This is not a guarantee:

docker run -m 1g --memory-reservation 768m your-image

Useful for co-located containers where you want to signal priority. The container with the lower reservation gets reclaimed first.

Docker Compose Memory Limits

In Compose v3 with swarm mode, memory limits go under deploy.resources:

services:
  app:
    image: your-image
    deploy:
      resources:
        limits:
          memory: 1g
        reservations:
          memory: 512m

Important: Older standalone Compose ignored the deploy key under docker compose up unless you passed --compatibility or ran under Docker Swarm; recent Compose v2 releases do apply deploy.resources.limits. If you need to support older tooling, use the Compose file v2 syntax instead:

services:
  app:
    image: your-image
    mem_limit: 1g
    mem_reservation: 512m
    memswap_limit: 1g

Monitor Container Resource Usage

Use SecureBin's ENV Validator to verify your Docker environment variables and memory configuration before deployment.

Validate ENV Config

Finding Memory Leaks in Running Containers

Before increasing memory limits, determine whether your container has a legitimate memory need or a leak. Throwing more memory at a leak just delays the OOM kill.

docker stats

Real-time memory monitoring for running containers:

# All containers
docker stats

# Specific container, no-stream for a single snapshot
docker stats --no-stream <container>

# Custom format showing memory usage percentage
docker stats --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"

Watch the MEM USAGE column over time. A steady climb that never stabilizes indicates a leak. A sawtooth pattern (rise then drop) indicates normal GC behavior.
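That eyeball test can be roughed out as a shell function over a series of MEM USAGE samples; the 25% growth threshold here is illustrative, not a standard:

```shell
# Crude leak heuristic (sketch): flag a sample series if the last reading
# is more than 25% above the first. Values are MB.
mem_trend() {
  first=$1
  for last in "$@"; do :; done   # keep the final argument
  if [ $(( last * 4 )) -gt $(( first * 5 )) ]; then
    echo "climbing: ${first}MB -> ${last}MB (possible leak)"
  else
    echo "stable: ${first}MB -> ${last}MB"
  fi
}

mem_trend 400 430 470 540 610   # steady climb, flags a possible leak
mem_trend 400 460 410 440 420   # sawtooth, reported as stable
```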

Reading cgroup Memory Files Directly

For more granular data than docker stats provides, read the cgroup files directly:

# cgroup v2 (modern kernels, Docker 20.10+), systemd cgroup driver
CONTAINER_ID=$(docker inspect <container> --format='{{.Id}}')
CG="/sys/fs/cgroup/system.slice/docker-${CONTAINER_ID}.scope"
cat ${CG}/memory.current    # Current usage in bytes
cat ${CG}/memory.max        # Limit in bytes
cat ${CG}/memory.stat       # Detailed breakdown

# cgroup v1 (older kernels)
CGROUP_ID=$(docker inspect <container> --format='{{.Id}}')
cat /sys/fs/cgroup/memory/docker/${CGROUP_ID}/memory.usage_in_bytes
cat /sys/fs/cgroup/memory/docker/${CGROUP_ID}/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/docker/${CGROUP_ID}/memory.stat

The memory.stat file breaks down usage into anonymous pages (heap), file-backed pages (cache), kernel memory, and more. For leak investigation, focus on anon (cgroup v2) or rss (cgroup v1) growing unbounded.
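Pulling the anon counter out of memory.stat is a one-line awk; a minimal sketch, using sample data borrowed from the dmesg output earlier in this guide:

```shell
# Extract the anonymous-memory counter from a cgroup v2 memory.stat file.
anon_bytes() { awk '$1 == "anon" {print $2}' "$1"; }

# Fixture standing in for a real /sys/fs/cgroup/.../memory.stat
printf 'anon 536870912\nfile 46301184\nkernel 8388608\n' > /tmp/memory.stat.sample
anon_bytes /tmp/memory.stat.sample   # 536870912
```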

Process-Level Memory Inside the Container

# Exec into the container
docker exec -it <container> /bin/sh

# Check process memory
cat /proc/1/status | grep -i vm
# VmRSS = resident set size (physical memory)
# VmSize = virtual memory (includes mapped but unused pages)

# Or use top/htop if available
top -bn1 | head -20

Java Container OOM Patterns

Java is the most common source of container OOM kills because the JVM's memory model is more complex than a simple heap allocation.

The -Xmx Trap

Setting -Xmx equal to the container memory limit guarantees an OOM kill. The JVM uses significant memory outside the heap:

  • Metaspace: Class metadata, typically 50-200MB for large applications
  • Thread stacks: Default 1MB per thread. 200 threads = 200MB
  • Code cache: JIT compiled code, typically 48-240MB
  • Direct byte buffers: NIO buffers allocated outside the heap
  • Native memory: JNI allocations, compressed class pointers, GC data structures

Rule of thumb: set -Xmx to 75% of the container memory limit. For a 1GB container:

docker run -m 1g your-java-app \
  java -Xmx768m -Xms768m \
  -XX:MaxMetaspaceSize=128m \
  -XX:ReservedCodeCacheSize=64m \
  -jar app.jar
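The 75% rule is simple arithmetic; a throwaway helper makes the mapping from container limit to -Xmx explicit:

```shell
# Map a container memory limit (MB) to an -Xmx flag at 75%, leaving the
# remainder for metaspace, thread stacks, code cache, and native memory.
heap_for() { echo "-Xmx$(( $1 * 75 / 100 ))m"; }

heap_for 1024   # -Xmx768m
heap_for 2048   # -Xmx1536m
```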

UseContainerSupport (JDK 10+)

Modern JVMs (JDK 10+) detect container memory limits and set heap size automatically:

# JDK 10+: enabled by default
java -XX:+UseContainerSupport \
     -XX:MaxRAMPercentage=75.0 \
     -jar app.jar

-XX:MaxRAMPercentage=75.0 tells the JVM to use 75% of the detected container memory for the heap. This is container-aware and more reliable than hardcoding -Xmx.

Gotcha: On JDK 8u131-191, the flag is -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap, but it only reads cgroup v1 files. If your host uses cgroup v2, the JVM sees the full host memory instead of the container limit. Upgrade to JDK 11+ for cgroup v2 support.

Native Memory Tracking

Enable NMT to see exactly where JVM memory is going:

# Enable NMT with summary level
docker run -m 1g your-java-app \
  java -XX:NativeMemoryTracking=summary -jar app.jar

# Then exec into the container and dump the report
docker exec <container> jcmd 1 VM.native_memory summary

Sample output showing memory breakdown:

Total: reserved=1842MB, committed=987MB
-                 Java Heap (reserved=768MB, committed=768MB)
-                     Class (reserved=136MB, committed=98MB)
-                    Thread (reserved=142MB, committed=142MB)
-                      Code (reserved=64MB, committed=52MB)
-                        GC (reserved=38MB, committed=38MB)
-                  Internal (reserved=12MB, committed=12MB)
-                    Symbol (reserved=8MB, committed=8MB)

Node.js Container OOM Patterns

Node.js uses V8's heap with a default max size of approximately 1.5GB on 64-bit systems (varies by V8 version). In a container with a 512MB limit, Node.js will attempt to allocate up to 1.5GB and get OOM killed long before V8 triggers its own GC pressure.

Set --max-old-space-size

# Set V8 heap to 384MB in a 512MB container
docker run -m 512m your-node-app \
  node --max-old-space-size=384 server.js

In your Dockerfile:

ENV NODE_OPTIONS="--max-old-space-size=384"
CMD ["node", "server.js"]

Detecting Node.js Memory Leaks

Use the built-in --inspect flag and Chrome DevTools for heap snapshots:

# Run with inspector
docker run -m 512m -p 9229:9229 your-node-app \
  node --inspect=0.0.0.0:9229 --max-old-space-size=384 server.js

Then open chrome://inspect in Chrome, connect to the container, and take heap snapshots. Compare snapshots taken minutes apart. Objects that grow between snapshots without corresponding request load are leaks.

Common Node.js leak patterns:

  • Event listener accumulation: Calling .on() without .off() in request handlers
  • Closure captures: Closures referencing large objects that outlive the request
  • Global caches without eviction: In-memory caches that grow without TTL or LRU
  • Unresolved promises: Promises that never resolve or reject hold their closure scope

PHP Container OOM Patterns

PHP processes typically have shorter lifecycles than Java or Node.js, but PHP-FPM worker pools can consume significant aggregate memory.

PHP memory_limit vs Container Limit

PHP's memory_limit controls per-script memory. But a PHP-FPM pool with 20 workers, each using 128MB, needs 2.5GB for workers alone plus memory for the FPM master process and the OS.

# Calculate required container memory
# Workers * memory_limit + FPM master (50MB) + OS overhead (100MB)
# 10 workers * 128MB + 50MB + 100MB = 1.43GB

docker run -m 1536m your-php-app
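The comment block above, as a reusable function (the 50MB FPM master and 100MB OS allowances are this guide's rough estimates, not fixed costs):

```shell
# Rough PHP-FPM container sizing: workers * per-worker memory_limit (MB)
# plus ~50MB for the FPM master and ~100MB OS overhead.
fpm_container_mb() { echo $(( $1 * $2 + 50 + 100 )); }

fpm_container_mb 10 128   # 1430
fpm_container_mb 20 128   # 2710
```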

PHP-FPM Tuning for Containers

In php-fpm.conf or the pool config (www.conf):

; Use static pool sizing in containers (not dynamic)
pm = static
pm.max_children = 10

; Kill workers that leak memory
pm.max_requests = 500

pm = static is preferred in containers because dynamic scaling adds unpredictable memory spikes. pm.max_requests recycles workers after N requests, which mitigates PHP extensions that leak native memory (common with older MySQL or image processing extensions).

Swap Configuration

Swap in containers is a tradeoff: it prevents OOM kills but causes severe performance degradation when the container starts swapping. For most production workloads, disabling swap and sizing the memory limit correctly is the better approach.

Disable Swap Entirely

# Set --memory-swap equal to --memory to disable swap
docker run -m 1g --memory-swap 1g your-image

Allow Limited Swap

# 1GB RAM + 512MB swap = 1.5GB total
docker run -m 1g --memory-swap 1536m your-image

Control Swappiness

# Reduce swap preference (0-100, lower = less swapping)
docker run -m 1g --memory-swappiness=10 your-image

Note: --memory-swappiness requires swap to be enabled on the host. If the host has swap disabled (swapoff -a), this flag has no effect.

Cgroup v2 Gotchas

Docker 20.10+ supports cgroup v2 (unified hierarchy). Most current Linux distributions (Ubuntu 22.04+, Fedora 31+, Debian 11+) default to cgroup v2. This causes several behavior changes that break assumptions from the cgroup v1 era.

Check Which Cgroup Version You Are Running

# If this file exists, you are on cgroup v2
test -f /sys/fs/cgroup/cgroup.controllers && echo "cgroup v2" || echo "cgroup v1"

# Or check the filesystem type
mount | grep cgroup
# cgroup v2: cgroup2 on /sys/fs/cgroup type cgroup2
# cgroup v1: cgroup on /sys/fs/cgroup/memory type cgroup

Key Differences

  • No --kernel-memory flag: Cgroup v2 removed kernel memory limits. Docker silently ignores --kernel-memory on v2. If your Compose file sets kernel_memory, it will be ignored without warning.
  • memory.high and memory.low (soft limits): Cgroup v2 replaces the v1 soft limit with two knobs: memory.low (reclaim protection) and memory.high (a throttling threshold; exceeding it triggers reclaim pressure before the memory.max hard limit is hit). Docker maps --memory-reservation to memory.low; memory.high is not directly exposed by docker run.
  • Separate swap accounting: In v1, memory.memsw.limit_in_bytes was memory + swap combined. In v2, memory.swap.max controls swap independently. Docker translates --memory-swap correctly, but if you read cgroup files directly, the semantics differ.
  • OOM kill behavior: Cgroup v2 adds memory.oom.group: when it is enabled for a cgroup, the kernel SIGKILLs every process in that cgroup, not just the one that triggered the OOM. This matters for containers running multiple processes (e.g., supervisord), where v1 killed only the offending process.
  • JDK 8 does not read cgroup v2: As mentioned above, JDK 8 reads v1 files only. On a cgroup v2 host, JDK 8 sees the full host memory and may set its heap too high, causing OOM kills.

Force Cgroup v1 (Workaround)

If you must run legacy applications that do not support cgroup v2, you can force v1 via a kernel boot parameter:

# Add to GRUB_CMDLINE_LINUX in /etc/default/grub
GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0"

# Then update GRUB and reboot
sudo update-grub
sudo reboot

This is a temporary workaround. The correct fix is to move to a runtime that reads cgroup v2 files (e.g., JDK 11+ with UseContainerSupport), and to set explicit heap flags for runtimes, like Node.js, that never consult cgroup limits at all.

Generate Secure Docker Configs

Use SecureBin's Docker to Compose converter to generate properly structured Compose files with memory limits.

Convert Docker to Compose

Preventing OOM Kills in Production

A checklist for container memory configuration in production environments:

  1. Always set explicit memory limits. A container without -m can use all host memory, potentially OOM killing other containers or the host itself.
  2. Profile memory under realistic load. Run load tests and observe peak memory via docker stats or Prometheus. Set the limit to 125-150% of observed peak.
  3. Set runtime-specific heap limits. Java: MaxRAMPercentage=75. Node.js: --max-old-space-size at 75% of container limit. PHP: memory_limit * workers + overhead.
  4. Disable swap in production. Set --memory-swap equal to --memory. Swap masks memory issues and causes unpredictable latency.
  5. Monitor memory trends. A container that slowly climbs from 40% to 95% over days has a leak. Fix the leak, do not increase the limit.
  6. Set up OOM alerts. Use Docker events, Prometheus alerts, or Datadog monitors to alert on container OOM kills immediately.
  7. Use --oom-kill-disable with extreme caution. This flag prevents the OOM killer from killing the container (and is not supported on cgroup v2 at all), but the host may kill other processes instead. Only use this for critical single-container hosts.

The Bottom Line

Exit code 137 with "OOMKilled": true means your container exceeded its memory cgroup limit. The fix is not always "give it more memory." First, profile the actual memory usage. Second, check that your runtime (JVM, V8, PHP-FPM) is configured to respect container limits. Third, investigate leaks by comparing memory over time. Only after confirming the workload genuinely needs more memory should you increase the limit. And always set explicit limits, because a container without memory limits is a host OOM event waiting to happen.

Related Articles

Continue reading: Fix Kubernetes CrashLoopBackOff, Fix AWS EFS Permission Denied, Kubernetes Secrets Management, Fix Let's Encrypt Renewal Failed, How to Secure API Keys in Code.

Written by Usman Khan
DevOps Engineer | MSc Cybersecurity | CEH | AWS Solutions Architect

Usman has 10+ years of experience securing enterprise infrastructure, managing high-traffic servers, and building zero-knowledge security tools. Read more about the author.