Memory Sysctl Tuning Reference

Compact reference for all memory-related sysctls under /proc/sys/vm/. Each entry links to the kernel sysctl documentation.

Read or write any value at runtime:

# Read
sysctl vm.swappiness
# Write (requires root; lost on reboot)
sysctl -w vm.swappiness=10
# Persistent (requires root; survives reboot)
echo "vm.swappiness = 10" >> /etc/sysctl.d/99-tuning.conf
# Apply all sysctl configuration files now, without rebooting
sysctl --system
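
Each vm.* name corresponds to a file under /proc/sys/vm/, so values can also be read without the sysctl binary. A minimal sketch of that mapping follows; the helper name sysctl_path is hypothetical, and it assumes the key contains no dots other than the separators (true for every vm.* entry on this page).

```shell
# Hypothetical helper: map a dotted sysctl name to its /proc/sys path.
# Assumes dots only act as separators, which holds for vm.* keys.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Read the value straight from procfs when the file is available.
path=$(sysctl_path vm.swappiness)
if [ -r "$path" ]; then
    cat "$path"
fi
```

Reading the file is equivalent to `sysctl -n vm.swappiness`; writing to it as root has the same effect as `sysctl -w`.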

Reclaim

Controls how aggressively the kernel reclaims memory and manages free-page watermarks.

| Sysctl | Default | Description | When to change |
| --- | --- | --- | --- |
| vm.swappiness | 60 | Relative weight for reclaiming anonymous pages vs. page cache (0-200) | Lower for latency-sensitive workloads that prefer dropping cache over swapping; raise for memory-overcommitted hosts |
| vm.min_free_kbytes | Varies (scales with RAM) | Minimum amount of free memory, in KB, that the kernel keeps reserved | Raise on systems that hit direct reclaim stalls or need burst allocations (e.g., high-speed networking) |
| vm.watermark_scale_factor | 10 | Gap between the min/low/high watermarks as a fraction of zone size (units of 0.01%, so 10 = 0.1%) | Raise to wake kswapd earlier and reduce direct reclaim; useful for bursty allocation patterns |
| vm.watermark_boost_factor | 15000 | Temporary watermark boost after memory fragmentation events (units of 0.01% of the high watermark, so 15000 = 150%) | Lower or set to 0 on systems where boosted reclaim causes unnecessary swapping |
| vm.vfs_cache_pressure | 100 | Tendency to reclaim dentry/inode caches vs. page cache (0 = never, 100 = fair, >100 = aggressive) | Lower for file-server workloads with many small files; raise when dentry/inode caches consume too much memory |
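
As a rough worked example of the vm.watermark_scale_factor units, the sketch below converts a factor into an approximate watermark gap in KB. The function name watermark_gap_kb and the 16 GiB figure are illustrative assumptions. The kernel computes the gap per zone and also applies a floor derived from vm.min_free_kbytes, which this sketch ignores.

```shell
# watermark_gap_kb MANAGED_KB FACTOR
# Approximate gap between adjacent watermarks: FACTOR/10000 of managed memory.
# Hypothetical helper; ignores the per-zone split and the min_free_kbytes floor.
watermark_gap_kb() {
    echo $(( $1 * $2 / 10000 ))
}

watermark_gap_kb 16777216 10     # 16 GiB at the default factor: 16777 KB
watermark_gap_kb 16777216 100    # same memory, factor raised to 100: 167772 KB
```

Raising the factor tenfold widens the gap tenfold, which gives kswapd that much more room to work before allocations fall into direct reclaim.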

Dirty Pages

Controls when and how often dirty (modified) page cache is written back to disk.

| Sysctl | Default | Description | When to change |
| --- | --- | --- | --- |
| vm.dirty_ratio | 20 | Percentage of available memory (free plus reclaimable pages) at which a writing process is forced to start writeback itself | Lower for latency-sensitive I/O; raise for throughput on fast storage |
| vm.dirty_background_ratio | 10 | Percentage of available memory (free plus reclaimable pages) at which background writeback (per-device writeback threads) kicks in | Lower to keep less dirty data in memory (safer on power loss); raise on fast storage to batch more writes |
| vm.dirty_expire_centisecs | 3000 | Age, in centiseconds (3000 = 30 s), at which dirty data becomes eligible for writeback | Lower for data safety; raise to allow more write coalescing |
| vm.dirty_writeback_centisecs | 500 | Interval, in centiseconds (500 = 5 s), between writeback thread wake-ups | Lower for more frequent flushes; raise (or set to 0 to disable periodic writeback) on battery-powered devices to reduce disk wake-ups |

Tip

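Use dirty_bytes / dirty_background_bytes instead of the ratio variants when you need an absolute cap that does not scale with RAM. Only one of each pair is active at a time: writing a bytes variant makes its ratio counterpart read as 0, and vice versa.
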
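
To see why the ratio variants can be too coarse on large machines, the sketch below converts a ratio into an absolute threshold. The helper name ratio_to_kb and the 64 GiB of available memory are assumptions chosen for illustration.

```shell
# ratio_to_kb AVAILABLE_KB RATIO
# Hypothetical helper: threshold in KB for a given percentage of available memory.
ratio_to_kb() {
    echo $(( $1 * $2 / 100 ))
}

avail_kb=67108864                # assumed: 64 GiB of free + reclaimable memory
ratio_to_kb "$avail_kb" 20       # dirty_ratio=20            -> 13421772 KB (~12.8 GiB)
ratio_to_kb "$avail_kb" 10       # dirty_background_ratio=10 -> 6710886 KB (~6.4 GiB)

# An absolute cap instead (requires root; values are examples only):
#   sysctl -w vm.dirty_background_bytes=268435456   # 256 MiB
#   sysctl -w vm.dirty_bytes=1073741824             # 1 GiB
```

With the defaults, a 64 GiB host can accumulate several GiB of dirty data before a writer is throttled, which is the case the bytes variants are meant to bound.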
Overcommit

Controls the kernel's memory overcommit policy. See also Memory Overcommit.

| Sysctl | Default | Description | When to change |
| --- | --- | --- | --- |
| vm.overcommit_memory | 0 | Policy: 0 = heuristic, 1 = always allow, 2 = strict limit | Set to 2 on systems where OOM kills are unacceptable (databases, embedded); set to 1 for certain HPC or container setups |
| vm.overcommit_ratio | 50 | When overcommit_memory=2: commit limit = swap + RAM * ratio / 100 | Raise when applications legitimately need more virtual memory; only effective with mode 2 |
| vm.overcommit_kbytes | 0 | When overcommit_memory=2: absolute overcommit limit in KB (overrides ratio if nonzero) | Use instead of ratio when you need a fixed commit limit that does not scale with RAM |
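
The commit limit formula for mode 2 can be checked with simple arithmetic. The sketch below is a worked example under assumed figures (16 GiB RAM, 4 GiB swap); the helper name commit_limit_kb is hypothetical, and it ignores memory reserved for huge pages.

```shell
# commit_limit_kb RAM_KB SWAP_KB RATIO
# Hypothetical helper implementing: swap + RAM * ratio / 100
commit_limit_kb() {
    echo $(( $2 + $1 * $3 / 100 ))
}

commit_limit_kb 16777216 4194304 50    # default ratio: 12582912 KB (12 GiB)
commit_limit_kb 16777216 4194304 80    # ratio raised to 80: 17616076 KB
```

On a live system the kernel's own figure appears as CommitLimit in /proc/meminfo, next to Committed_AS, which shows how much is currently committed.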

Huge Pages

Controls static (hugetlbfs) huge page allocation. For transparent huge pages, see THP.

| Sysctl | Default | Description | When to change |
| --- | --- | --- | --- |
| vm.nr_hugepages | 0 | Number of persistent huge pages to pre-allocate | Set to the number of huge pages your application requires (e.g., DPDK, large databases) |
| vm.nr_overcommit_hugepages | 0 | Additional huge pages that can be allocated on demand beyond nr_hugepages | Set when applications may need burst huge-page capacity but you do not want to pin all memory upfront |
| vm.hugetlb_shm_group | 0 | GID allowed to create SysV shared memory segments backed by huge pages | Set to the group ID of unprivileged users that need huge-page shared memory (e.g., database service accounts) |
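
Sizing vm.nr_hugepages is a ceiling division of the memory an application needs by the huge page size. A minimal sketch, assuming a 2048 KB huge page (the Hugepagesize line in /proc/meminfo reports the actual value); the helper name hugepages_needed and the workload sizes are illustrative.

```shell
# hugepages_needed SIZE_KB HUGEPAGE_KB
# Hypothetical helper: number of huge pages required, rounded up.
hugepages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

hugepages_needed 8388608 2048    # 8 GiB buffer pool, 2 MiB pages: 4096
hugepages_needed 1000000 2048    # a size that is not a multiple of 2 MiB: 489
```

The result is the value to write to vm.nr_hugepages. Check HugePages_Total in /proc/meminfo afterwards, since the kernel may allocate fewer pages than requested on a fragmented system.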

NUMA

Tuning knobs for Non-Uniform Memory Access systems. See also NUMA.

| Sysctl | Default | Description | When to change |
| --- | --- | --- | --- |
| vm.zone_reclaim_mode | 0 | Bitmask controlling whether the kernel reclaims local-node memory before allocating remotely (1 = enable zone reclaim, 2 = writeback, 4 = swap) | Enable on large NUMA systems where local-memory latency matters more than cache retention; leave at 0 for most workloads |
| vm.numa_stat | 1 | Enable per-node NUMA hit/miss/foreign statistics in /proc/vmstat | Disable (set to 0) on large NUMA machines where the per-update overhead of these counters is measurable |
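
Because vm.zone_reclaim_mode is a bitmask, values such as 5 or 7 combine several behaviors. The sketch below decodes a value into the three flags listed in the table; the helper name decode_zone_reclaim_mode and the flag labels are hypothetical.

```shell
# decode_zone_reclaim_mode VALUE
# Hypothetical helper: print which of the three bits are set.
decode_zone_reclaim_mode() {
    flags=""
    if [ $(( $1 & 1 )) -ne 0 ]; then flags="$flags reclaim"; fi
    if [ $(( $1 & 2 )) -ne 0 ]; then flags="$flags writeback"; fi
    if [ $(( $1 & 4 )) -ne 0 ]; then flags="$flags swap"; fi
    flags=${flags# }             # drop the leading space, if any
    echo "${flags:-off}"
}

decode_zone_reclaim_mode 0       # off
decode_zone_reclaim_mode 5       # reclaim swap
decode_zone_reclaim_mode 7       # reclaim writeback swap
```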

OOM

Controls Out-Of-Memory killer behavior. See also Running out of memory.

| Sysctl | Default | Description | When to change |
| --- | --- | --- | --- |
| vm.oom_kill_allocating_task | 0 | When set to 1, kill the task that triggered OOM instead of scanning for the worst offender | Enable on systems where scanning is too slow or you want deterministic OOM behavior |
| vm.oom_dump_tasks | 1 | Dump per-task memory info to the kernel log on OOM | Disable on systems with thousands of tasks where the dump itself causes problems |
| vm.panic_on_oom | 0 | Trigger a kernel panic instead of invoking the OOM killer (0 = off, 1 = panic, 2 = panic even for cgroup OOM) | Enable in clusters where a dead node is preferable to a degraded one (lets a watchdog or cluster manager restart the machine) |

Miscellaneous

| Sysctl | Default | Description | When to change |
| --- | --- | --- | --- |
| vm.max_map_count | 65530 | Maximum number of VMAs a process may have | Raise for applications with many mappings (e.g., JVMs, Elasticsearch, mmap-heavy databases) |
| vm.mmap_min_addr | 0 (kernel default; most distros set 65536) | Lowest virtual address a user-space process is allowed to mmap | Lower only if legacy software requires mapping at address zero; keep high for NULL-pointer dereference protection |
| vm.compact_memory | Write-only | Writing 1 triggers synchronous memory compaction across all zones | Use before allocating huge pages on a fragmented system; avoid in production hot paths |
| vm.drop_caches | Write-only | Writing 1 = free page cache, 2 = free dentries/inodes, 3 = both | Use only for benchmarking or debugging; not recommended in production as the kernel manages caches effectively on its own |
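
Before raising vm.max_map_count, it can help to measure how close a process actually is to the limit. Each line of /proc/PID/maps describes one VMA, so a line count gives the current usage. A minimal sketch; the helper name vma_usage_pct is hypothetical.

```shell
# vma_usage_pct MAPS_FILE MAX_MAP_COUNT
# Hypothetical helper: VMAs in use as a whole-number percentage of the limit.
vma_usage_pct() {
    count=$(wc -l < "$1")
    echo $(( count * 100 / $2 ))
}

# Example: check the current shell against the live limit, when procfs is available.
if [ -r "/proc/$$/maps" ] && [ -r /proc/sys/vm/max_map_count ]; then
    vma_usage_pct "/proc/$$/maps" "$(cat /proc/sys/vm/max_map_count)"
fi
```

Reading another user's maps file requires appropriate privileges; substitute the PID of the process under investigation for `$$`.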

Quick Tips

  • Check current values: sysctl -a | grep '^vm\.' lists every vm.* parameter and its value.
  • Monitor effects: watch /proc/vmstat, /proc/meminfo, and per-zone stats in /proc/zoneinfo after changing a sysctl.
  • Persist changes: place tuning in /etc/sysctl.d/*.conf files rather than editing /etc/sysctl.conf directly.
  • Container awareness: inside a container, many vm.* sysctls are host-global and require privileged access. Memory cgroup knobs (see memcg) are the per-container equivalent.
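
One way to monitor the effect of a change is to compare two snapshots of /proc/vmstat and print only the counters that moved. The sketch below assumes the file's usual "name value" layout; the helper name vmstat_delta and the snapshot paths in the comments are hypothetical.

```shell
# vmstat_delta BEFORE_FILE AFTER_FILE
# Hypothetical helper: print "counter delta" for every counter whose value changed.
vmstat_delta() {
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) && ($2 != before[$1]) { print $1, $2 - before[$1] }' "$1" "$2"
}

# Typical use (paths are examples):
#   cat /proc/vmstat > /tmp/vmstat.before
#   sysctl -w vm.swappiness=10      # or run the workload under test
#   sleep 60
#   cat /proc/vmstat > /tmp/vmstat.after
#   vmstat_delta /tmp/vmstat.before /tmp/vmstat.after
```

Filtering the output for counters such as those beginning with pgscan or pgsteal narrows it to reclaim activity.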