Reading an OOM log
How to read and diagnose Out-of-Memory killer messages
What does an OOM log look like?
When the kernel runs out of memory and cannot reclaim enough pages, it invokes the OOM killer. The log is generated by functions in `mm/oom_kill.c`, primarily `dump_header()`, `dump_tasks()`, and `__oom_kill_process()`.
The log has four sections:
- Trigger line — who invoked OOM and why
- Memory state dump — system or cgroup memory statistics
- Task table — all processes with their memory usage
- Kill line — which process was selected and killed
Annotated Example: System-Wide OOM
```
[ 621.423101] python3 invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
```
| Field | Meaning |
|---|---|
| `python3` | The process whose allocation failed (not necessarily the victim) |
| `gfp_mask=0x140cca` | Allocation flags — GFP_HIGHUSER_MOVABLE means a userspace page allocation |
| `order=0` | Requested a single page (2^0 = 1 page). Higher orders mean contiguous allocation. |
| `oom_score_adj=0` | OOM adjustment of the triggering process (not the victim) |
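For reference, `order=n` requests a contiguous block of 2^n pages. Assuming 4 KiB pages (check with `getconf PAGESIZE`), the sizes work out as:

```shell
# Allocation size per order, assuming 4 KiB pages.
for order in 0 1 2 3; do
  echo "order=$order -> $(( 4096 << order )) bytes"
done
# order=0 -> 4096 bytes
# order=1 -> 8192 bytes
# order=2 -> 16384 bytes
# order=3 -> 32768 bytes
```

High-order allocations (order 3 and above) are harder to satisfy because they need physically contiguous free pages, which is why they can trigger OOM even when total free memory looks adequate.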
Stack trace
```
[ 621.423118] <TASK>
[ 621.423137] oom_kill_process+0x101/0x170
[ 621.423143] out_of_memory+0x116/0x570
[ 621.423149] __alloc_pages_slowpath+0xc9a/0xe00
[ 621.423168] handle_pte_fault+0xc6/0x260
[ 621.423182] do_user_addr_fault+0x1c5/0x660
[ 621.423187] exc_page_fault+0x6e/0x160
```
This tells you the allocation path. Here: a page fault (exc_page_fault) triggered an anonymous page allocation (handle_pte_fault) that entered the slow path and hit OOM. The most common paths are:
| Stack signature | Cause |
|---|---|
| `do_anonymous_page` / `handle_pte_fault` | Process touching new heap/stack memory |
| `do_read_fault` / `filemap_fault` | Reading a file that needed a new page cache page |
| `__alloc_pages` without fault path | Kernel subsystem allocating (slab, networking, etc.) |
Memory state
```
[ 621.423200] active_anon:1932541 inactive_anon:12038 isolated_anon:0
[ 621.423200] active_file:524 inactive_file:618 isolated_file:0
[ 621.423200] unevictable:0 dirty:0 writeback:0
[ 621.423200] slab_reclaimable:6592 slab_unreclaimable:11840
[ 621.423200] mapped:1296 shmem:396 pagetables:7912
[ 621.423200] free:3248 free_pcp:112 free_cma:0
```
All values are in pages (typically 4 KB each). Key things to check:
| Field | This example | What it tells you |
|---|---|---|
| `active_anon` | 1,932,541 (~7.4 GB) | Anonymous memory dominates — an application is consuming most RAM |
| `active_file` + `inactive_file` | 1,142 (~4.5 MB) | Page cache almost completely evicted — no room for file data |
| `free` | 3,248 (~12.7 MB) | Critically low |
| `slab_unreclaimable` | 11,840 (~46 MB) | Kernel memory that cannot be freed |
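These counters are page counts, so converting to bytes is a multiplication by the page size. For example, the `active_anon` value above, assuming 4 KiB pages:

```shell
# Convert a page count from the OOM log to MiB,
# assuming 4 KiB pages (check with: getconf PAGESIZE).
pages=1932541
echo "$(( pages * 4096 / 1024 / 1024 )) MiB"
# prints "7548 MiB" (~7.4 GiB)
```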
Per-zone watermarks
Free (11,984 kB) is barely above the min watermark (11,408 kB). This shows the system had exhausted reclaim — OOM fires only when an allocation still fails after the kernel has attempted direct reclaim and cannot free enough pages to satisfy the request.
`all_unreclaimable? yes` means the kernel has given up trying to reclaim from this node.
Swap state
| Situation | Meaning |
|---|---|
| `Total swap = 0kB` | No swap configured — anonymous pages cannot be reclaimed at all |
| `Free swap = 0kB` with nonzero Total | Swap exists but is full — all options exhausted |
| `Free swap` has space | OOM despite available swap — likely a cgroup limit or GFP constraint |
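On a live system, the same totals the OOM log prints are available in `/proc/meminfo`, which is a quick way to check which of the situations above applies before an OOM ever happens:

```shell
# Current swap totals in kB; SwapTotal of 0 means anonymous
# pages cannot be reclaimed at all (the first row above).
awk '/^SwapTotal:|^SwapFree:/ { print $1, $2, $3 }' /proc/meminfo
```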
Task table
```
[ 621.423260] Tasks state (memory values in pages):
[ 621.423262] [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[ 621.423268] [ 567] 0 567 13876 1120 460 660 0 81920 0 0 systemd-journal
[ 621.423275] [ 601] 0 601 5765 948 512 436 0 69632 0 -1000 systemd-udevd
[ 621.423296] [ 8340] 1000 8340 1956284 1920568 1920012 460 96 15466496 0 0 python3
```
| Column | Meaning |
|---|---|
| `total_vm` | Virtual address space (pages) |
| `rss` | Resident pages (actually in RAM) |
| `rss_anon` | Anonymous resident pages (heap, stack) |
| `rss_file` | File-backed resident pages (libraries, mmap'd files) |
| `rss_shmem` | Shared memory resident pages |
| `pgtables_bytes` | Page table memory for this process |
| `swapents` | Pages in swap |
| `oom_score_adj` | Manual OOM priority (-1000 = immune, +1000 = kill first) |
In this example, PID 8340 (python3) has 1,920,568 resident pages (~7.3 GB), almost entirely anonymous. It is the clear outlier.
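With a long task table, sorting by the `rss` column finds the outlier quickly. A sketch using the sample rows from this log (the field position assumes the layout shown above — two fields for the timestamp, two for the bracketed pid, then uid, tgid, total_vm, rss):

```shell
# Rank OOM task-table lines by rss (8th whitespace-separated field).
cat <<'EOF' | sort -k8 -rn | head -3
[ 621.423268] [ 567] 0 567 13876 1120 460 660 0 81920 0 0 systemd-journal
[ 621.423275] [ 601] 0 601 5765 948 512 436 0 69632 0 -1000 systemd-udevd
[ 621.423296] [ 8340] 1000 8340 1956284 1920568 1920012 460 96 15466496 0 0 python3
EOF
```

On a live system you can apply the same sort to `dmesg` output by piping the task-table lines through it; adjust the field number if your kernel's columns differ.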
Kill line
```
[ 621.423315] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),global_oom,task_memcg=/user.slice/...,task=python3,pid=8340,uid=1000
[ 621.423320] Out of memory: Killed process 8340 (python3) total-vm:7825136kB, anon-rss:7680048kB, file-rss:1840kB, shmem-rss:384kB, UID:1000 pgtables:15104kB oom_score_adj:0
```
| Field | Meaning |
|---|---|
| `constraint=CONSTRAINT_NONE` | System-wide OOM (not cpuset, NUMA, or cgroup constrained) |
| `global_oom` | Not a memcg-scoped OOM |
| `Out of memory:` prefix | System-wide (vs `Memory cgroup out of memory:` for cgroup OOM) |
| `anon-rss:7680048kB` | The process's anonymous memory (the actual consumption) |
The Scoring Algorithm
The victim is selected by oom_badness():
| Component | What it counts |
|---|---|
| RSS | rss_anon + rss_file + rss_shmem — resident pages |
| Swap entries | Pages this process has in swap |
| Page tables | mm_pgtables_bytes / PAGE_SIZE |
| Adjustment | oom_score_adj * totalpages / 1000 |
Special cases:
- oom_score_adj = -1000: process is immune (never selected)
- oom_score_adj = 1000: adds totalpages to score (guaranteed victim)
- Kernel threads and init (PID 1) are never killed
The visible score in /proc/<pid>/oom_score is a normalized version of this calculation, computed on read.
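As a back-of-the-envelope check, the score for PID 8340 above can be reproduced from its task-table columns. This is a sketch of the arithmetic only, not the kernel code, and `totalpages` is an assumed value (8 GiB of RAM, no swap):

```shell
# oom_badness() sketch: points = rss + swap entries + page-table
# pages, then adjusted by oom_score_adj * totalpages / 1000.
rss=1920568; swapents=0; pgtables_bytes=15466496
adj=0; totalpages=2097152   # assumption: 8 GiB RAM, no swap
points=$(( rss + swapents + pgtables_bytes / 4096 + adj * totalpages / 1000 ))
echo "$points pages"
# prints "1924344 pages"
```

With `adj=0` the adjustment term vanishes, so the score is essentially the process's memory footprint in pages — which is why the largest consumer is usually the victim.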
Cgroup OOM vs System-Wide OOM
How to tell the difference
| | System-wide | Cgroup |
|---|---|---|
| Kill prefix | `Out of memory:` | `Memory cgroup out of memory:` |
| Constraint | `CONSTRAINT_NONE` | `CONSTRAINT_MEMCG` |
| Memory dump | Full `Mem-Info:` with zone/node stats | Cgroup memory stats (format differs between cgroup v1 and v2) |
| Task table | All processes | Only processes in the cgroup |
| `totalpages` for scoring | System RAM + swap | Cgroup's `memory.max` |
Cgroup OOM example (key differences)
Cgroup v2 (default on modern kernels, via mem_cgroup_print_oom_meminfo() in mm/memcontrol.c):
```
# Cgroup memory stats
[ 891.112245] Memory cgroup stats for /myapp.slice/myapp.service:
[ 891.112248] anon 536346624
[ 891.112248] file 262144
[ 891.112248] kernel 2158592
...
# Constraint shows MEMCG and the cgroup path
[ 891.112285] oom-kill:constraint=CONSTRAINT_MEMCG,...,oom_memcg=/myapp.slice/myapp.service,...
# Kill prefix differs
[ 891.112290] Memory cgroup out of memory: Killed process 12001 (python3) ...
```
Cgroup v1 shows a different format with usage, limit, and failcnt fields. On cgroup v2, the memory stats are printed as counter names with byte values.
memory.oom.group
When memory.oom.group = 1 is set on a cgroup, the OOM killer kills all processes in that cgroup, not just one. An additional line appears:
```
[ 891.112295] Tasks in /myapp.slice/myapp.service are going to be killed due to memory.oom.group set
```
This walks up the cgroup hierarchy to the highest ancestor with oom_group set, then kills every process in the subtree (except those with oom_score_adj = -1000).
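On cgroup v2, enabling it is a one-line write (the path below is illustrative; substitute your own cgroup):

```shell
# Kill the whole service as a unit on OOM (cgroup v2;
# /myapp.slice/myapp.service is an illustrative path).
echo 1 > /sys/fs/cgroup/myapp.slice/myapp.service/memory.oom.group
```

This is useful for multi-process services where losing one worker leaves the service in an inconsistent state — killing everything lets the supervisor restart it cleanly.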
Common OOM Patterns
Pattern 1: Memory leak
Signature: One process has dramatically larger RSS than everything else. rss_anon dominates.
```
[ 12340] 1000 12340 4521984 3906712 3906200 440 72 31457280 0 0 leaky-app
[ 567] 0 567 13876 1120 460 660 0 81920 0 0 systemd-journal
```
Diagnosis: leaky-app at ~14.9 GB RSS while everything else is under 5 MB. Unbounded heap growth.
Action: Fix the leak. As a temporary measure, set a cgroup memory.max to contain it.
Pattern 2: Fork bomb
Signature: Hundreds of identical small processes. High pagetables and slab_unreclaimable.
```
[ 15001] 1000 15001 2710 892 504 388 0 61440 0 0 bomb.sh
[ 15002] 1000 15002 2710 892 504 388 0 61440 0 0 bomb.sh
... (hundreds more)
```
Diagnosis: Per-process kernel overhead (stacks, page tables, task_struct) adds up. OOM may kill just one, and the fork bomb recreates it.
Action: Use ulimit -u (max processes), cgroup pids.max, or systemd TasksMax=.
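The three limits above can be set as follows (a sketch; the unit name, cgroup path, and values are illustrative):

```shell
# Cap process counts three ways (pick the one that fits):
ulimit -u 1000                                       # per-shell/user process limit
echo 256 > /sys/fs/cgroup/myapp.slice/pids.max       # cgroup v2 pids controller (illustrative path)
systemctl set-property myapp.service TasksMax=256    # systemd equivalent for a unit
```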
Pattern 3: Legitimate workload, not enough RAM
Signature: Multiple processes with reasonable RSS, no single outlier. active_file + inactive_file near zero. No swap or swap is full.
```
[ 1200] 0 1200 524288 262100 261800 220 80 2097152 0 0 postgres
[ 1201] 0 1201 524288 261984 261700 200 84 2097152 0 0 postgres
[ 2001] 1000 2001 131072 65500 65200 240 60 524288 0 0 redis-server
```
Diagnosis: Every process is doing real work. The system genuinely needs more memory.
Action: Add RAM, add swap, or distribute workloads across hosts.
Pattern 4: Kernel slab leak
Signature: slab_unreclaimable is very large. Task table RSS does not account for most used memory.
Diagnosis: Kernel data structures are leaking. The gap between total RAM and sum of all process RSS is in kernel slab.
Action: Use slabtop or /proc/slabinfo to identify which cache is growing. Common offenders: dentry, inode_cache, NFS inode caches.
Five Questions to Answer
When reading any OOM log, answer these:
| Question | Where to look |
|---|---|
| System-wide or cgroup OOM? | Kill prefix and constraint field |
| What consumed the most memory? | Task table, largest rss |
| Was swap available? | Free swap / Total swap lines |
| Was the system thrashing before OOM? | active_file + inactive_file near zero, all_unreclaimable? yes |
| Could it have been prevented? | See table below |

| Finding | Prevention |
|---|---|
| No swap configured | Add swap for emergency breathing room |
| One process with massive RSS | Cgroup `memory.max`, or fix the leak |
| `oom_score_adj=-1000` on non-critical processes | Audit OOM protection — do not protect everything |
| `slab_unreclaimable` dominating | Investigate kernel slab leak |
| Page cache near zero, I/O workload | Add RAM — the system genuinely needs more |
| Low `min_free_kbytes` | Increase `vm.min_free_kbytes` to trigger kswapd earlier |
Try It Yourself
```shell
# View past OOM events
dmesg | grep -i "out of memory"
dmesg | grep "oom-kill"
dmesg | grep "oom_kill"

# Check current OOM scores for all processes
for pid in /proc/[0-9]*/; do
  score=$(cat ${pid}oom_score 2>/dev/null)
  adj=$(cat ${pid}oom_score_adj 2>/dev/null)
  name=$(cat ${pid}comm 2>/dev/null)
  [ -n "$score" ] && echo "$score $adj $name"
done | sort -rn | head -20

# Monitor OOM kill events via tracing (oom:mark_victim traces actual kills)
echo 1 > /sys/kernel/debug/tracing/events/oom/mark_victim/enable
cat /sys/kernel/debug/tracing/trace_pipe

# Trigger a controlled OOM in a cgroup (for testing)
mkdir -p /sys/fs/cgroup/test-oom
echo 50M > /sys/fs/cgroup/test-oom/memory.max
echo $$ > /sys/fs/cgroup/test-oom/cgroup.procs
stress --vm 1 --vm-bytes 100M --timeout 10s
```
Key Source Files
| File | What it contains |
|---|---|
| `mm/oom_kill.c` | Scoring, selection, killing, log output |
| `mm/memcontrol.c` | Cgroup OOM: `mem_cgroup_print_oom_meminfo()`, `mem_cgroup_get_oom_group()` |
| `mm/show_mem.c` | `__show_mem()` producing the Mem-Info dump |
| `mm/page_alloc.c` | `__alloc_pages_may_oom()` entry point, zone stats |
Further Reading
- Running out of memory — the full OOM lifecycle (watermarks through reclaim to OOM)
- /proc/meminfo — understanding the memory state fields
- /proc/vmstat — reclaim and pressure counters
- Page reclaim — what happens before OOM fires
- Memory cgroups — cgroup memory limits and OOM behavior
- vm sysctl docs — tunable parameters