# Memory Sysctl Tuning Reference

Compact reference for all memory-related sysctls under `/proc/sys/vm/`. Each entry links to the kernel sysctl documentation.
Read or write any value at runtime:

```shell
# Read
sysctl vm.swappiness

# Write (non-persistent)
sysctl -w vm.swappiness=10

# Persistent (survives reboot)
echo "vm.swappiness = 10" >> /etc/sysctl.d/99-tuning.conf
sysctl --system
```
## Reclaim

Controls how aggressively the kernel reclaims memory and manages free-page watermarks.
| Sysctl | Default | Description | When to change |
|---|---|---|---|
| `vm.swappiness` | 60 | Weight for reclaiming anonymous pages vs. page cache (0-200) | Lower for latency-sensitive workloads that prefer dropping cache over swapping; raise for memory-overcommitted hosts |
| `vm.min_free_kbytes` | Varies (scales with RAM) | Minimum kB of free memory the kernel keeps reserved | Raise on systems that hit direct-reclaim stalls or need burst allocations (e.g., high-speed networking) |
| `vm.watermark_scale_factor` | 10 | Gap between min/low/high watermarks as a fraction of zone size (units of 0.01%) | Raise to wake kswapd earlier, reducing direct reclaim; useful for bursty allocation patterns |
| `vm.watermark_boost_factor` | 15000 | Temporary watermark boost after memory-fragmentation events (units of 0.01%) | Lower or set to 0 on systems where boosted reclaim causes unnecessary swapping |
| `vm.vfs_cache_pressure` | 100 | Tendency to reclaim dentry/inode caches vs. page cache (0 = never, 100 = fair, >100 = aggressive) | Lower for file-server workloads with many small files; raise when dentry/inode caches consume too much memory |
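These knobs combine naturally. Below is one illustrative `/etc/sysctl.d/` fragment for a latency-sensitive host that prefers dropping page cache over swapping; all values here are assumptions to validate against your own workload, not recommendations:

```
# /etc/sysctl.d/90-reclaim.conf -- illustrative values only
vm.swappiness = 10              # prefer dropping cache over swapping
vm.vfs_cache_pressure = 50      # keep dentry/inode caches around longer
vm.watermark_scale_factor = 125 # wake kswapd earlier to avoid direct reclaim
```

Apply with `sysctl --system` and verify with `sysctl vm.swappiness`.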
## Dirty Pages

Controls when and how often dirty (modified) page cache is written back to disk.
| Sysctl | Default | Description | When to change |
|---|---|---|---|
| `vm.dirty_ratio` | 20 | Percentage of available (dirtyable) memory at which a writing process is forced to start writeback itself | Lower for latency-sensitive I/O; raise for throughput on fast storage |
| `vm.dirty_background_ratio` | 10 | Percentage of available memory at which background writeback (per-device writeback threads) kicks in | Lower to keep less dirty data in memory (safer on power loss); raise on fast storage to batch more writes |
| `vm.dirty_expire_centisecs` | 3000 | Age (in centiseconds) at which dirty data becomes eligible for writeback | Lower for data safety; raise to allow more write coalescing |
| `vm.dirty_writeback_centisecs` | 500 | Interval (in centiseconds) between writeback-thread wake-ups | Lower for more frequent flushes; raise (or set to 0 to disable periodic writeback) on battery-powered devices to reduce disk wake-ups |
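To see what the ratio knobs mean in absolute terms on a given machine, a small read-only sketch (assumes a Linux `/proc` filesystem; the kernel's real "dirtyable" base excludes some memory, so treat `MemTotal` here as an upper-bound approximation):

```shell
#!/bin/sh
# Approximate the writeback thresholds implied by the ratio sysctls.
# Uses MemTotal as the base; the kernel actually uses "dirtyable"
# memory (free + reclaimable), so the real thresholds are somewhat lower.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
fg=$(cat /proc/sys/vm/dirty_ratio)
bg=$(cat /proc/sys/vm/dirty_background_ratio)
echo "process-blocking writeback at ~$(( mem_kb * fg / 100 )) kB dirty"
echo "background writeback starts at ~$(( mem_kb * bg / 100 )) kB dirty"
```

If the `*_bytes` variants are active, the corresponding ratio reads as 0 and the estimate above is not meaningful.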
> **Tip:** Use `dirty_bytes` / `dirty_background_bytes` instead of the ratio variants when you need an absolute cap that does not scale with RAM. Only one of each bytes/ratio pair is active at a time; writing one makes its counterpart read as 0.
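For example, on a large-memory host where even 1% of RAM is too much buffered dirty data, a fragment along these lines could be used (sizes are illustrative assumptions, not recommendations):

```
# /etc/sysctl.d/91-dirty-bytes.conf -- illustrative absolute caps
vm.dirty_background_bytes = 67108864   # start background writeback at 64 MiB
vm.dirty_bytes = 268435456             # block writers at 256 MiB dirty
```

After applying, `sysctl vm.dirty_ratio` will report 0, confirming the bytes variants took over.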
## Overcommit

Controls the kernel's memory overcommit policy. See also Memory Overcommit.
| Sysctl | Default | Description | When to change |
|---|---|---|---|
| `vm.overcommit_memory` | 0 | Policy: 0 = heuristic, 1 = always allow, 2 = strict limit | Set to 2 on systems where OOM kills are unacceptable (databases, embedded); set to 1 for certain HPC or container setups |
| `vm.overcommit_ratio` | 50 | With `overcommit_memory=2`: commit limit = swap + RAM × ratio / 100 | Raise when applications legitimately need more virtual memory; only effective in mode 2 |
| `vm.overcommit_kbytes` | 0 | With `overcommit_memory=2`: absolute commit limit in kB (overrides the ratio when nonzero) | Use instead of the ratio when you need a precise byte-level commit limit |
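The mode-2 formula can be checked against the kernel's own figure, which `/proc/meminfo` reports regardless of the active mode. A read-only sketch (assumes Linux; the kernel subtracts reserved huge pages from its calculation, so the numbers may diverge on hugetlb hosts):

```shell
#!/bin/sh
# Recompute CommitLimit = SwapTotal + MemTotal * overcommit_ratio / 100
# and compare against the kernel's reported value.
swap_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
ratio=$(cat /proc/sys/vm/overcommit_ratio)
echo "computed: $(( swap_kb + mem_kb * ratio / 100 )) kB"
grep '^CommitLimit' /proc/meminfo    # kernel's own figure
```

Compare `Committed_AS` in `/proc/meminfo` against `CommitLimit` to judge whether mode 2 would start refusing allocations on your workload.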
## Huge Pages

Controls static (hugetlbfs) huge page allocation. For transparent huge pages, see THP.
| Sysctl | Default | Description | When to change |
|---|---|---|---|
| `vm.nr_hugepages` | 0 | Number of persistent huge pages to pre-allocate | Set to the number of huge pages your application requires (e.g., DPDK, large databases) |
| `vm.nr_overcommit_hugepages` | 0 | Additional huge pages that may be allocated on demand beyond `nr_hugepages` | Set when applications may need burst huge-page capacity but you do not want to pin all memory upfront |
| `vm.hugetlb_shm_group` | 0 | GID allowed to create SysV shared memory segments backed by huge pages | Set to the group ID of unprivileged users that need huge-page shared memory (e.g., database service accounts) |
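A boot-time setup might look like the fragment below; the page counts and group ID are placeholders to adapt, not recommendations (and assume the default 2 MiB huge page size):

```
# /etc/sysctl.d/80-hugepages.conf -- placeholder values
vm.nr_hugepages = 512             # pin 512 x 2 MiB pages (1 GiB)
vm.nr_overcommit_hugepages = 128  # allow up to 128 more on demand
vm.hugetlb_shm_group = 999        # GID of e.g. a database service account
```

Check the result via `HugePages_Total` and `HugePages_Free` in `/proc/meminfo`; on a fragmented system the kernel may grant fewer pages than requested.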
## NUMA

Tuning knobs for Non-Uniform Memory Access systems. See also NUMA.
| Sysctl | Default | Description | When to change |
|---|---|---|---|
| `vm.zone_reclaim_mode` | 0 | Bitmask controlling whether the kernel reclaims local-node memory before allocating remotely (1 = enable zone reclaim, 2 = write back dirty pages, 4 = swap) | Enable on large NUMA systems where local-memory latency matters more than cache retention; leave at 0 for most workloads |
| `vm.numa_stat` | 1 | Enable per-node NUMA hit/miss/foreign statistics in `/proc/vmstat` | Disable (set to 0) on large NUMA machines where the per-update overhead of these counters is measurable |
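When `numa_stat` is enabled, the counters appear in `/proc/vmstat`. A quick read-only sketch (the `numa_*` counter names assume a NUMA-capable kernel; on non-NUMA builds they may be absent):

```shell
#!/bin/sh
# Snapshot the NUMA allocation counters. numa_miss and numa_foreign
# growing faster than numa_hit suggests frequent off-node allocations.
grep '^numa_' /proc/vmstat || echo "no NUMA counters on this kernel"
```

Take two snapshots a few minutes apart and diff them; absolute totals since boot are rarely informative on their own.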
## OOM

Controls Out-Of-Memory killer behavior. See also Running out of memory.
| Sysctl | Default | Description | When to change |
|---|---|---|---|
| `vm.oom_kill_allocating_task` | 0 | When set to 1, kill the task that triggered the OOM instead of scanning for the worst offender | Enable on systems where scanning is too slow or you want deterministic OOM behavior |
| `vm.oom_dump_tasks` | 1 | Dump per-task memory info to the kernel log on OOM | Disable on systems with thousands of tasks where the dump itself causes problems |
| `vm.panic_on_oom` | 0 | Trigger a kernel panic instead of invoking the OOM killer (0 = off, 1 = panic, 2 = panic even for cgroup OOM) | Enable in clusters where a dead node is preferable to a degraded one (lets a watchdog or cluster manager restart the machine) |
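These sysctls are global; the complementary per-process knob is `/proc/<pid>/oom_score_adj` (range -1000 to 1000, where -1000 exempts the task entirely). A hedged sketch — the "critical" PID here is a placeholder you must supply, and lowering the value requires root or CAP_SYS_RESOURCE:

```shell
#!/bin/sh
# Read this shell's own OOM adjustment (usually 0), then try to exempt
# a hypothetical critical process from the OOM killer.
cat /proc/self/oom_score_adj
critical_pid=$$   # placeholder: substitute your service's PID
echo -1000 > /proc/"$critical_pid"/oom_score_adj 2>/dev/null \
    || echo "lowering oom_score_adj requires root/CAP_SYS_RESOURCE"
```

Service managers expose the same knob declaratively (e.g., systemd's `OOMScoreAdjust=`), which is usually preferable to writing `/proc` by hand.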
## Miscellaneous
| Sysctl | Default | Description | When to change |
|---|---|---|---|
| `vm.max_map_count` | 65530 | Maximum number of VMAs (memory mappings) a process may have | Raise for applications with many mappings (e.g., JVMs, Elasticsearch, mmap-heavy databases) |
| `vm.mmap_min_addr` | 0 (kernel default; most distros set 65536) | Lowest virtual address a user-space process is allowed to mmap | Lower only if legacy software requires mapping at address zero; keep high for NULL-pointer-dereference protection |
| `vm.compact_memory` | Write-only | Writing 1 triggers synchronous memory compaction across all zones | Use before allocating huge pages on a fragmented system; avoid in production hot paths |
| `vm.drop_caches` | Write-only | Writing 1 = free page cache, 2 = free dentries/inodes, 3 = both | Use only for benchmarking or debugging; not recommended in production as the kernel manages caches effectively on its own |
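A common sequence before a cold-cache benchmark or a large huge-page allocation combines these write-only triggers. Both writes need root, so this sketch guards them and degrades to a read-only check otherwise:

```shell
#!/bin/sh
# Flush dirty data, drop caches for a cold-cache benchmark run, then
# compact memory to improve the odds of a huge-page allocation.
sync                                       # write dirty pages first
if [ -w /proc/sys/vm/drop_caches ]; then
    echo 3 > /proc/sys/vm/drop_caches      # free page cache + dentries/inodes
    echo 1 > /proc/sys/vm/compact_memory   # synchronous compaction, all zones
fi
grep '^MemFree' /proc/meminfo              # observe the effect
```

Always `sync` first: `drop_caches` only frees clean pages, so unwritten dirty data would otherwise stay resident.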
## Quick Tips

- **Check current values:** `sysctl -a | grep vm.` lists every `vm.*` parameter and its value.
- **Monitor effects:** watch `/proc/vmstat`, `/proc/meminfo`, and per-zone stats in `/proc/zoneinfo` after changing a sysctl.
- **Persist changes:** place tuning in `/etc/sysctl.d/*.conf` files rather than editing `/etc/sysctl.conf` directly.
- **Container awareness:** inside a container, many `vm.*` sysctls are host-global and require privileged access. Memory cgroup knobs (see memcg) are the per-container equivalent.
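A minimal before/after check around any tuning change might look like the read-only sketch below (field names assume a standard Linux kernel):

```shell
#!/bin/sh
# Snapshot the numbers most memory-tuning changes move: dirty/writeback
# page totals plus reclaim and swap activity. Run before and after a
# change and compare the two snapshots.
grep -E '^(Dirty|Writeback):' /proc/meminfo
grep -E '^(pgscan_|pgsteal_|pswpin|pswpout)' /proc/vmstat
```

Rates matter more than totals: diff two snapshots taken under steady load rather than reading a single one.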