Reading an OOM log
How to read and diagnose Out-of-Memory killer messages
What does an OOM log look like?
When the kernel runs out of memory and cannot reclaim enough pages, it invokes the OOM killer. The log is generated by functions in `mm/oom_kill.c`, primarily `dump_header()`, `dump_tasks()`, and `__oom_kill_process()`.
The log has four sections:
- Trigger line — who invoked OOM and why
- Memory state dump — system or cgroup memory statistics
- Task table — all processes with their memory usage
- Kill line — which process was selected and killed
Annotated Example: System-Wide OOM
```
[ 621.423101] python3 invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
```
| Field | Meaning |
|---|---|
| `python3` | The process whose allocation failed (not necessarily the victim) |
| `gfp_mask=0x140cca` | Allocation flags — GFP_HIGHUSER_MOVABLE means a userspace page allocation |
| `order=0` | Requested a single page (2^0 = 1 page). Higher orders mean contiguous allocation. |
| `oom_score_adj=0` | OOM adjustment of the triggering process (not the victim) |
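For reference, `order=n` requests a contiguous block of 2^n pages. Assuming 4 KiB pages (check with `getconf PAGESIZE`), the sizes work out as:

```shell
# Allocation size per order, assuming 4 KiB pages.
for order in 0 1 2 3; do
  echo "order=$order -> $(( 4096 << order )) bytes"
done
# order=0 -> 4096 bytes
# order=1 -> 8192 bytes
# order=2 -> 16384 bytes
# order=3 -> 32768 bytes
```

High-order allocations (order 3 and above) are harder to satisfy because they need physically contiguous free pages, which is why they can trigger OOM even when total free memory looks adequate.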
Stack trace
```
[ 621.423118] <TASK>
[ 621.423137] oom_kill_process+0x101/0x170
[ 621.423143] out_of_memory+0x116/0x570
[ 621.423149] __alloc_pages_slowpath+0xc9a/0xe00
[ 621.423168] handle_pte_fault+0xc6/0x260
[ 621.423182] do_user_addr_fault+0x1c5/0x660
[ 621.423187] exc_page_fault+0x6e/0x160
```
This tells you the allocation path. Here: a page fault (exc_page_fault) triggered an anonymous page allocation (handle_pte_fault) that entered the slow path and hit OOM. The most common paths are:
| Stack signature | Cause |
|---|---|
| `do_anonymous_page` / `handle_pte_fault` | Process touching new heap/stack memory |
| `do_read_fault` / `filemap_fault` | Reading a file that needed a new page cache page |
| `__alloc_pages` without fault path | Kernel subsystem allocating (slab, networking, etc.) |
Memory state
```
[ 621.423200] active_anon:1932541 inactive_anon:12038 isolated_anon:0
[ 621.423200] active_file:524 inactive_file:618 isolated_file:0
[ 621.423200] unevictable:0 dirty:0 writeback:0
[ 621.423200] slab_reclaimable:6592 slab_unreclaimable:11840
[ 621.423200] mapped:1296 shmem:396 pagetables:7912
[ 621.423200] free:3248 free_pcp:112 free_cma:0
```
All values are in pages (typically 4 KB each). Key things to check:
| Field | This example | What it tells you |
|---|---|---|
| `active_anon` | 1,932,541 (~7.4 GB) | Anonymous memory dominates — an application is consuming most RAM |
| `active_file` + `inactive_file` | 1,142 (~4.5 MB) | Page cache almost completely evicted — no room for file data |
| `free` | 3,248 (~12.7 MB) | Critically low |
| `slab_unreclaimable` | 11,840 (~46 MB) | Kernel memory that cannot be freed |
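These counters are page counts, so converting to bytes is a multiplication by the page size. For example, the `active_anon` value above, assuming 4 KiB pages:

```shell
# Convert a page count from the OOM log to MiB,
# assuming 4 KiB pages (check with: getconf PAGESIZE).
pages=1932541
echo "$(( pages * 4096 / 1024 / 1024 )) MiB"
# prints "7548 MiB" (~7.4 GiB)
```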
Per-zone watermarks
Free (11,984 kB) is barely above the min watermark (11,408 kB). This shows the system had exhausted reclaim — OOM fires only when an allocation still fails after the kernel has attempted direct reclaim and cannot free enough pages to satisfy the request.
`all_unreclaimable? yes` means the kernel has given up trying to reclaim from this node.
Swap state
| Situation | Meaning |
|---|---|
| `Total swap = 0kB` | No swap configured — anonymous pages cannot be reclaimed at all |
| `Free swap = 0kB` with nonzero Total | Swap exists but is full — all options exhausted |
| `Free swap` has space | OOM despite available swap — likely a cgroup limit or GFP constraint |
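On a live system, the same totals the OOM log prints are available in `/proc/meminfo`, which is a quick way to check which of the situations above applies before an OOM ever happens:

```shell
# Current swap totals in kB; SwapTotal of 0 means anonymous
# pages cannot be reclaimed at all (the first row above).
awk '/^SwapTotal:|^SwapFree:/ { print $1, $2, $3 }' /proc/meminfo
```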
Task table
```
[ 621.423260] Tasks state (memory values in pages):
[ 621.423262] [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[ 621.423268] [ 567] 0 567 13876 1120 460 660 0 81920 0 0 systemd-journal
[ 621.423275] [ 601] 0 601 5765 948 512 436 0 69632 0 -1000 systemd-udevd
[ 621.423296] [ 8340] 1000 8340 1956284 1920568 1920012 460 96 15466496 0 0 python3
```
| Column | Meaning |
|---|---|
| `total_vm` | Virtual address space (pages) |
| `rss` | Resident pages (actually in RAM) |
| `rss_anon` | Anonymous resident pages (heap, stack) |
| `rss_file` | File-backed resident pages (libraries, mmap'd files) |
| `rss_shmem` | Shared memory resident pages |
| `pgtables_bytes` | Page table memory for this process |
| `swapents` | Pages in swap |
| `oom_score_adj` | Manual OOM priority (-1000 = immune, +1000 = kill first) |
In this example, PID 8340 (python3) has 1,920,568 resident pages (~7.3 GB), almost entirely anonymous. It is the clear outlier.
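With a long task table, sorting by the `rss` column finds the outlier quickly. A sketch using the sample rows from this log (the field position assumes the layout shown above — two fields for the timestamp, two for the bracketed pid, then uid, tgid, total_vm, rss):

```shell
# Rank OOM task-table lines by rss (8th whitespace-separated field).
cat <<'EOF' | sort -k8 -rn | head -3
[ 621.423268] [ 567] 0 567 13876 1120 460 660 0 81920 0 0 systemd-journal
[ 621.423275] [ 601] 0 601 5765 948 512 436 0 69632 0 -1000 systemd-udevd
[ 621.423296] [ 8340] 1000 8340 1956284 1920568 1920012 460 96 15466496 0 0 python3
EOF
```

On a live system you can apply the same sort to `dmesg` output by piping the task-table lines through it; adjust the field number if your kernel's columns differ.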
Kill line
```
[ 621.423315] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),global_oom,task_memcg=/user.slice/...,task=python3,pid=8340,uid=1000
[ 621.423320] Out of memory: Killed process 8340 (python3) total-vm:7825136kB, anon-rss:7680048kB, file-rss:1840kB, shmem-rss:384kB, UID:1000 pgtables:15104kB oom_score_adj:0
```
| Field | Meaning |
|---|---|
| `constraint=CONSTRAINT_NONE` | System-wide OOM (not cpuset, NUMA, or cgroup constrained) |
| `global_oom` | Not a memcg-scoped OOM |
| `Out of memory:` prefix | System-wide (vs `Memory cgroup out of memory:` for cgroup OOM) |
| `anon-rss:7680048kB` | The process's anonymous memory (the actual consumption) |
The Scoring Algorithm
The victim is selected by oom_badness():
| Component | What it counts |
|---|---|
| RSS | rss_anon + rss_file + rss_shmem — resident pages |
| Swap entries | Pages this process has in swap |
| Page tables | mm_pgtables_bytes / PAGE_SIZE |
| Adjustment | oom_score_adj * totalpages / 1000 |
Special cases:
- oom_score_adj = -1000: process is immune (never selected)
- oom_score_adj = 1000: adds totalpages to score (guaranteed victim)
- Kernel threads and init (PID 1) are never killed
The visible score in /proc/<pid>/oom_score is a normalized version of this calculation, computed on read.
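As a back-of-the-envelope check, the score for PID 8340 above can be reproduced from its task-table columns. This is a sketch of the arithmetic only, not the kernel code, and `totalpages` is an assumed value (8 GiB of RAM, no swap):

```shell
# oom_badness() sketch: points = rss + swap entries + page-table
# pages, then adjusted by oom_score_adj * totalpages / 1000.
rss=1920568; swapents=0; pgtables_bytes=15466496
adj=0; totalpages=2097152   # assumption: 8 GiB RAM, no swap
points=$(( rss + swapents + pgtables_bytes / 4096 + adj * totalpages / 1000 ))
echo "$points pages"
# prints "1924344 pages"
```

With `adj=0` the adjustment term vanishes, so the score is essentially the process's memory footprint in pages — which is why the largest consumer is usually the victim.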
Cgroup OOM vs System-Wide OOM
How to tell the difference
| | System-wide | Cgroup |
|---|---|---|
| Kill prefix | `Out of memory:` | `Memory cgroup out of memory:` |
| Constraint | `CONSTRAINT_NONE` | `CONSTRAINT_MEMCG` |
| Memory dump | Full `Mem-Info:` with zone/node stats | Cgroup memory stats (format differs between cgroup v1 and v2) |
| Task table | All processes | Only processes in the cgroup |
| `totalpages` for scoring | System RAM + swap | Cgroup's `memory.max` |
Cgroup OOM example (key differences)
Cgroup v2 (default on modern kernels, via mem_cgroup_print_oom_meminfo() in mm/memcontrol.c):
```
# Cgroup memory stats
[ 891.112245] Memory cgroup stats for /myapp.slice/myapp.service:
[ 891.112248] anon 536346624
[ 891.112248] file 262144
[ 891.112248] kernel 2158592
...
# Constraint shows MEMCG and the cgroup path
[ 891.112285] oom-kill:constraint=CONSTRAINT_MEMCG,...,oom_memcg=/myapp.slice/myapp.service,...
# Kill prefix differs
[ 891.112290] Memory cgroup out of memory: Killed process 12001 (python3) ...
```
Cgroup v1 shows a different format with usage, limit, and failcnt fields. On cgroup v2, the memory stats are printed as counter names with byte values.
memory.oom.group
When memory.oom.group = 1 is set on a cgroup, the OOM killer kills all processes in that cgroup, not just one. An additional line appears:
```
[ 891.112295] Tasks in /myapp.slice/myapp.service are going to be killed due to memory.oom.group set
```
This walks up the cgroup hierarchy to the highest ancestor with oom_group set, then kills every process in the subtree (except those with oom_score_adj = -1000).
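On cgroup v2, enabling it is a one-line write (the path below is illustrative; substitute your own cgroup):

```shell
# Kill the whole service as a unit on OOM (cgroup v2;
# /myapp.slice/myapp.service is an illustrative path).
echo 1 > /sys/fs/cgroup/myapp.slice/myapp.service/memory.oom.group
```

This is useful for multi-process services where losing one worker leaves the service in an inconsistent state — killing everything lets the supervisor restart it cleanly.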
Common OOM Patterns
Pattern 1: Memory leak
Signature: One process has dramatically larger RSS than everything else. rss_anon dominates.
```
[ 12340] 1000 12340 4521984 3906712 3906200 440 72 31457280 0 0 leaky-app
[ 567] 0 567 13876 1120 460 660 0 81920 0 0 systemd-journal
```
Diagnosis: leaky-app at ~14.9 GB RSS while everything else is under 5 MB. Unbounded heap growth.
Action: Fix the leak. As a temporary measure, set a cgroup memory.max to contain it.
Pattern 2: Fork bomb
Signature: Hundreds of identical small processes. High pagetables and slab_unreclaimable.
```
[ 15001] 1000 15001 2710 892 504 388 0 61440 0 0 bomb.sh
[ 15002] 1000 15002 2710 892 504 388 0 61440 0 0 bomb.sh
... (hundreds more)
```
Diagnosis: Per-process kernel overhead (stacks, page tables, task_struct) adds up. OOM may kill just one, and the fork bomb recreates it.
Action: Use ulimit -u (max processes), cgroup pids.max, or systemd TasksMax=.
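The three limits above can be set as follows (a sketch; the unit name, cgroup path, and values are illustrative):

```shell
# Cap process counts three ways (pick the one that fits):
ulimit -u 1000                                       # per-shell/user process limit
echo 256 > /sys/fs/cgroup/myapp.slice/pids.max       # cgroup v2 pids controller (illustrative path)
systemctl set-property myapp.service TasksMax=256    # systemd equivalent for a unit
```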
Pattern 3: Legitimate workload, not enough RAM
Signature: Multiple processes with reasonable RSS, no single outlier. active_file + inactive_file near zero. No swap or swap is full.
```
[ 1200] 0 1200 524288 262100 261800 220 80 2097152 0 0 postgres
[ 1201] 0 1201 524288 261984 261700 200 84 2097152 0 0 postgres
[ 2001] 1000 2001 131072 65500 65200 240 60 524288 0 0 redis-server
```
Diagnosis: Every process is doing real work. The system genuinely needs more memory.
Action: Add RAM, add swap, or distribute workloads across hosts.
Pattern 4: Kernel slab leak
Signature: slab_unreclaimable is very large. Task table RSS does not account for most used memory.
Diagnosis: Kernel data structures are leaking. The gap between total RAM and sum of all process RSS is in kernel slab.
Action: Use slabtop or /proc/slabinfo to identify which cache is growing. Common offenders: dentry, inode_cache, NFS inode caches.
Five Questions to Answer
When reading any OOM log, answer these:
| Question | Where to look |
|---|---|
| System-wide or cgroup OOM? | Kill prefix and constraint field |
| What consumed the most memory? | Task table, largest rss |
| Was swap available? | Free swap / Total swap lines |
| Was the system thrashing before OOM? | active_file + inactive_file near zero, all_unreclaimable? yes |
| Could it have been prevented? | See table below |

| Finding | Prevention |
|---|---|
| No swap configured | Add swap for emergency breathing room |
| One process with massive RSS | Cgroup `memory.max`, or fix the leak |
| `oom_score_adj=-1000` on non-critical processes | Audit OOM protection — do not protect everything |
| `slab_unreclaimable` dominating | Investigate kernel slab leak |
| Page cache near zero, I/O workload | Add RAM — the system genuinely needs more |
| Low `min_free_kbytes` | Increase `vm.min_free_kbytes` to trigger kswapd earlier |
Try It Yourself
```shell
# View past OOM events
dmesg | grep -i "out of memory"
dmesg | grep "oom-kill"
dmesg | grep "oom_kill"

# Check current OOM scores for all processes
for pid in /proc/[0-9]*/; do
  score=$(cat ${pid}oom_score 2>/dev/null)
  adj=$(cat ${pid}oom_score_adj 2>/dev/null)
  name=$(cat ${pid}comm 2>/dev/null)
  [ -n "$score" ] && echo "$score $adj $name"
done | sort -rn | head -20

# Monitor OOM kill events via tracing (oom:mark_victim traces actual kills)
echo 1 > /sys/kernel/debug/tracing/events/oom/mark_victim/enable
cat /sys/kernel/debug/tracing/trace_pipe

# Trigger a controlled OOM in a cgroup (for testing)
mkdir -p /sys/fs/cgroup/test-oom
echo 50M > /sys/fs/cgroup/test-oom/memory.max
echo $$ > /sys/fs/cgroup/test-oom/cgroup.procs
stress --vm 1 --vm-bytes 100M --timeout 10s
```
Key Source Files
| File | What it contains |
|---|---|
| `mm/oom_kill.c` | Scoring, selection, killing, log output |
| `mm/memcontrol.c` | Cgroup OOM: `mem_cgroup_print_oom_meminfo()`, `mem_cgroup_get_oom_group()` |
| `mm/show_mem.c` | `__show_mem()` producing the Mem-Info dump |
| `mm/page_alloc.c` | `__alloc_pages_may_oom()` entry point, zone stats |
Further Reading
- Running out of memory — the full OOM lifecycle (watermarks through reclaim to OOM)
- /proc/meminfo — understanding the memory state fields
- /proc/vmstat — reclaim and pressure counters
- Page reclaim — what happens before OOM fires
- Memory cgroups — cgroup memory limits and OOM behavior
- vm sysctl docs — tunable parameters