Swap
Extending memory to disk
What Is Swap?
Swap allows the kernel to move infrequently used pages from RAM to disk, freeing memory for active use. When those pages are needed again, they're read back from swap.
RAM Full, need more memory:
┌─────────────────────────────────────────┐
│ RAM: [Active] [Active] [Inactive] [Active] │
└─────────────────────────────────────────┘
│
▼ (swap out)
┌─────────────────────────────────────────┐
│ RAM: [Active] [Active] [FREE] [Active] │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Swap: [Inactive page] │
└─────────────────────────────────────────┘
Later, page accessed:
│
▼ (swap in)
┌─────────────────────────────────────────┐
│ RAM: [Active] [Page back] [X] [Active] │
└─────────────────────────────────────────┘
Swap Types
Swap Partition
Dedicated disk partition for swap:
# Create swap partition (during install or with fdisk)
mkswap /dev/sda2
swapon /dev/sda2
# View active swap
swapon --show
# NAME TYPE SIZE USED PRIO
# /dev/sda2 partition 8G 1.2G -2
Swap File
Regular file used as swap:
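# Create a 4GB swap file (modern method)
fallocate -l 4G /swapfile
# Fallback if swapon rejects the file as having holes (filesystem-dependent):
# dd if=/dev/zero of=/swapfile bs=1M count=4096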
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
# Make permanent in /etc/fstab:
# /swapfile none swap sw 0 0
Note for btrfs (COW filesystems): The +C (no-copy-on-write) attribute must be set before the file has data:
# Method 1: Create empty file, set +C, then allocate
touch /swapfile
chattr +C /swapfile
fallocate -l 4G /swapfile
chmod 600 /swapfile && mkswap /swapfile && swapon /swapfile
# Method 2: Set +C on parent directory (new files inherit it)
mkdir /swap && chattr +C /swap
fallocate -l 4G /swap/swapfile
# ... then chmod 600, mkswap, swapon as usual
zswap (Compressed Swap Cache)
Compresses pages before writing to disk, often avoiding disk I/O entirely:
Page to swap out
│
v
Compress in RAM (zswap pool)
│
├── Fits in pool? ──► Store compressed (no disk I/O)
│
└── Pool full? ──► Write to backing swap device
# Enable zswap
echo 1 > /sys/module/zswap/parameters/enabled
# Configure compressor and pool
echo lz4 > /sys/module/zswap/parameters/compressor
echo 20 > /sys/module/zswap/parameters/max_pool_percent
# View statistics
grep -r . /sys/kernel/debug/zswap/ 2>/dev/null
zram (Compressed RAM Disk)
RAM-based block device with compression - swap without disk:
# Load module
modprobe zram
# Set size (compression makes effective size larger)
echo 4G > /sys/block/zram0/disksize
# Use as swap
mkswap /dev/zram0
swapon /dev/zram0 -p 100 # Higher priority than disk swap
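To check how well a zram device is compressing, its `mm_stat` file can be summarized. The sketch below assumes the documented field order, where the first field is `orig_data_size` and the third is `mem_used_total`, both in bytes. The function name `zram_summary` is illustrative, not part of any tool.

```shell
# Summarize zram compression from an mm_stat line read on stdin.
# Fields used (bytes): $1 = orig_data_size, $3 = mem_used_total.
zram_summary() {
    awk '{
        if ($1 == 0) { print "no data stored"; exit }
        printf "%d MiB stored in %d MiB of RAM (%d%% of original)\n",
               int($1 / 1048576), int($3 / 1048576), int($3 * 100 / $1)
    }'
}

# On a live system: zram_summary < /sys/block/zram0/mm_stat
```

`mem_used_total` is used rather than `compr_data_size` because it includes allocator overhead, so it is closer to the real RAM cost.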
Swap Architecture
Swap Areas
Linux supports multiple swap areas with priorities:
# View swap areas
cat /proc/swaps
# Filename Type Size Used Priority
# /dev/sda2 partition 8388604 1234560 -2
# /dev/zram0 partition 4194300 567890 100
Higher priority swap is used first. Equal priorities are striped (round-robin).
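The fill order implied by the priorities can be read straight from `/proc/swaps`. This is a minimal sketch that assumes the five-column layout shown above and device paths without spaces; `swap_fill_order` is a hypothetical helper name.

```shell
# List swap areas in the order the kernel fills them (highest priority first).
# Reads /proc/swaps-formatted text on stdin: Filename Type Size Used Priority.
swap_fill_order() {
    awk 'NR > 1 { print $5, $1 }' |
        sort -k1,1nr |
        awk '{ printf "%s (priority %s)\n", $2, $1 }'
}

# On a live system: swap_fill_order < /proc/swaps
```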
Swap Cache
Recently swapped-in pages stay in swap cache briefly:
Swap cache:
┌─────────────────────────────────────────┐
│ Page in RAM + still on swap │
│ (if process forks, child can share) │
└─────────────────────────────────────────┘
If a swapped-in page is swapped out again without modification, no disk write is needed.
Swap Slots
Swap space is divided into slots (one per page):
/* Each slot tracks one swapped page */
swap_entry_t entry = swp_entry(type, offset);
/* type: which swap area */
/* offset: slot within that area */
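The packing of type and offset into a single value can be illustrated with plain integer arithmetic. The sketch below assumes a 64-bit layout with 5 bits for the swap area index and 58 bits for the slot offset. Real layouts depend on kernel version and architecture, and the shell function names here are illustrative stand-ins, not the kernel helpers.

```shell
# Illustrative packing of (type, offset) into one integer.
# Assumed layout: top 5 bits = swap area index, low 58 bits = slot offset.
TYPE_SHIFT=58
OFFSET_MASK=$(( (1 << TYPE_SHIFT) - 1 ))

swp_pack()   { echo $(( ($1 << TYPE_SHIFT) | ($2 & OFFSET_MASK) )); }
swp_type()   { echo $(( $1 >> TYPE_SHIFT )); }
swp_offset() { echo $(( $1 & OFFSET_MASK )); }

entry=$(swp_pack 1 42)    # swap area 1, slot 42
echo "type=$(swp_type "$entry") offset=$(swp_offset "$entry")"
```

Storing both fields in one word is what lets a non-present PTE carry the full location of the swapped page.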
Swapping Mechanics
Swap Out (Page to Disk)
Memory pressure
│
v
Select victim page (from inactive list)
│
v
Allocate swap slot
│
v
Write page to swap
│
v
Update PTE: present=0, swap_entry=slot
│
v
Free page frame
Swap In (Disk to Page)
Process accesses swapped page
│
v
Page fault (not present)
│
v
Read swap entry from PTE
│
v
Allocate page frame
│
v
Read from swap into page
│
v
Update PTE: present=1, page_frame=new
│
v
Resume process
Configuration
Swappiness
Controls preference for swapping anonymous pages vs dropping file cache:
cat /proc/sys/vm/swappiness
# 60 (default; valid range 0-100, extended to 0-200 in kernel 5.8)
# Lower = prefer dropping cache, avoid swapping
echo 10 > /proc/sys/vm/swappiness
# Higher = more willing to swap
echo 80 > /proc/sys/vm/swappiness
# 0 = swap only to avoid OOM (not never)
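One way to read the swappiness number is as a pair of base weights for reclaim: anonymous pages are weighted by `swappiness` and file-backed pages by `200 - swappiness`. The sketch below is a simplification of the balance computed in mm/vmscan.c, which also scales these weights by recent reclaim cost. `swappiness_weights` is a hypothetical helper.

```shell
# Base reclaim weights derived from a swappiness value.
# Simplified: the kernel further adjusts these by recent reclaim activity.
swappiness_weights() {
    local s="$1"
    if [ "$s" -lt 0 ] || [ "$s" -gt 200 ]; then
        echo "swappiness must be 0-200" >&2
        return 1
    fi
    echo "anon=$s file=$(( 200 - s ))"
}

swappiness_weights 60    # anon=60 file=140
swappiness_weights 10    # anon=10 file=190
```

At 100 the two weights are equal, which is why values above 100 only make sense when swap is faster than the filesystem, for example with zram.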
Swap Priority
# Set priority when enabling
swapon -p 100 /dev/zram0 # High priority
swapon /dev/sda2 # No -p: kernel assigns a low negative priority (e.g. -2)
# In /etc/fstab:
# /dev/zram0 none swap sw,pri=100 0 0
# /dev/sda2 none swap sw 0 0
Overcommit
# Memory overcommit policy
cat /proc/sys/vm/overcommit_memory
# 0 = heuristic (default) - allow reasonable overcommit
# 1 = always allow - never fail malloc
# 2 = strict - limit to swap + ratio*RAM
# For mode 2, the ratio:
cat /proc/sys/vm/overcommit_ratio
# 50 (default) = swap + 50% of RAM
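As a worked example of the mode 2 limit, the arithmetic can be written out directly. This sketch ignores huge pages reserved out of RAM, which the kernel subtracts before applying the ratio; `commit_limit_kb` is an illustrative name. The value the kernel actually uses appears as `CommitLimit` in /proc/meminfo.

```shell
# Estimate the commit limit for overcommit_memory=2 (all sizes in kB):
#   limit = swap + RAM * overcommit_ratio / 100
# Simplification: huge page reservations are not subtracted from RAM.
commit_limit_kb() {
    local ram_kb="$1" swap_kb="$2" ratio="${3:-50}"
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# 16 GiB RAM, 8 GiB swap, default ratio of 50
commit_limit_kb 16777216 8388608    # 16777216
```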
Monitoring
Swap Usage
# Quick view
free -h
# total used free shared buff/cache available
# Swap: 8.0G 1.2G 6.8G
# Detailed
grep -i swap /proc/meminfo
# SwapCached: 123456 kB (pages in swap and RAM)
# SwapTotal: 8388604 kB
# SwapFree: 7000000 kB
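The two meminfo fields can be turned into a usage percentage for scripts and alerts. A minimal sketch, assuming the `SwapTotal` and `SwapFree` lines shown above; `swap_used_pct` is a hypothetical helper.

```shell
# Print swap usage as a percentage, from /proc/meminfo-formatted text on stdin.
swap_used_pct() {
    awk '
        /^SwapTotal:/ { total = $2 }
        /^SwapFree:/  { free  = $2 }
        END {
            if (total == 0) { print "no swap configured"; exit }
            used = total - free
            printf "%d kB used of %d kB (%d%%)\n", used, total, int(used * 100 / total)
        }'
}

# On a live system: swap_used_pct < /proc/meminfo
```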
Swap Activity
# Pages swapped in/out
grep -E "pswpin|pswpout" /proc/vmstat
# pswpin - Pages read from swap
# pswpout - Pages written to swap
# Real-time monitoring
vmstat 1
# si = swap in (KB/s)
# so = swap out (KB/s)
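The `pswpin` and `pswpout` counters only ever increase, so a rate needs two samples. The sketch below converts a pair of samples into KiB/s, assuming 4 KiB pages unless a different page size is passed. Both function names are illustrative.

```shell
# Read one counter (e.g. pswpin or pswpout) from /proc/vmstat-formatted text.
vmstat_counter() {        # usage: vmstat_counter NAME < /proc/vmstat
    awk -v name="$1" '$1 == name { print $2 }'
}

# Convert two counter samples into KiB/s.
swap_rate_kib() {         # usage: swap_rate_kib BEFORE AFTER SECONDS [PAGE_KIB]
    local page_kib="${4:-4}"
    echo $(( ($2 - $1) * page_kib / $3 ))
}

# Live sampling on a Linux host:
# a=$(vmstat_counter pswpout < /proc/vmstat); sleep 5
# b=$(vmstat_counter pswpout < /proc/vmstat)
# echo "swap-out: $(swap_rate_kib "$a" "$b" 5) KiB/s"
```

A sustained non-zero rate in both directions at once is a more reliable sign of thrashing than high swap usage alone.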
Per-Process Swap
# Swap usage per process
grep -i swap /proc/<pid>/status
# VmSwap: 1234 kB
# System-wide total
awk '/VmSwap/{sum+=$2} END {print sum" kB"}' /proc/*/status 2>/dev/null
# Top swap consumers by process
grep VmSwap /proc/*/status 2>/dev/null | sort -k2 -n | tail
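The listing above shows only paths under /proc. A variant that adds process names can be sketched as follows, assuming the `Name:` line precedes `VmSwap:` in each status file. `top_swap_users` is a hypothetical helper; its first argument exists so the function can be pointed at a test directory instead of /proc.

```shell
# List the top swap consumers with process names.
# $1 = proc root (default /proc), $2 = number of entries (default 10).
top_swap_users() {
    local root="${1:-/proc}" count="${2:-10}"
    awk '
        /^Name:/ { name = $2 }
        /^VmSwap:/ && $2 > 0 {
            n = split(FILENAME, parts, "/")    # path ends in <pid>/status
            printf "%d kB\t%s (pid %s)\n", $2, name, parts[n - 1]
        }
    ' "$root"/[0-9]*/status 2>/dev/null | sort -rn | head -n "$count"
}

# On a live system: top_swap_users
```

Processes that exit while the scan runs may be missed, so treat the output as a snapshot.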
zswap Statistics
cat /sys/kernel/debug/zswap/pool_total_size # Compressed size
cat /sys/kernel/debug/zswap/stored_pages # Pages in zswap
cat /sys/kernel/debug/zswap/written_back_pages # Evicted to disk
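The first two counters combine into an effective compression ratio. This sketch assumes `stored_pages` counts pages and `pool_total_size` is in bytes, with 4096-byte pages by default. `zswap_ratio` is an illustrative name, and the debugfs files may differ between kernel versions.

```shell
# Effective zswap compression ratio: uncompressed bytes / compressed bytes.
# usage: zswap_ratio STORED_PAGES POOL_BYTES [PAGE_SIZE]
zswap_ratio() {
    local pages="$1" pool_bytes="$2" page_size="${3:-4096}"
    if [ "$pool_bytes" -eq 0 ]; then
        echo "pool empty"
        return
    fi
    # Integer math: ratio scaled by 100 to keep two decimals.
    local x100=$(( pages * page_size * 100 / pool_bytes ))
    printf '%d.%02d\n' $(( x100 / 100 )) $(( x100 % 100 ))
}

# On a live system (root, debugfs mounted):
# zswap_ratio "$(cat /sys/kernel/debug/zswap/stored_pages)" \
#             "$(cat /sys/kernel/debug/zswap/pool_total_size)"
```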
Evolution
Original Swap (1991)
Basic swap support from the beginning. Single swap area.
Multiple Swap Areas (v1.3)
Support for multiple swap partitions with priorities.
Swap Files (v2.6)
Swap files became as efficient as partitions.
zswap (v3.11, 2013)
Commit: 2b2811178e85 ("zswap: add to mm/") | LKML
Compressed swap cache to reduce disk I/O.
zram Swap (v3.14, 2014)
zram became suitable for swap use, enabling diskless swap.
THP Swap (v4.13, 2017)
Commit: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out") | LKML
Author: Huang Ying
Splitting of a Transparent Huge Page is delayed during swap-out rather than done up front, which cuts per-page overhead.
Swap-over-NFS (v3.6, 2012)
Support for swapping to files on NFS was merged for diskless systems. It remains a niche configuration.
Swap vs No Swap
Arguments for Swap
| Benefit | Explanation |
|---|---|
| OOM prevention | Swap provides buffer before OOM killer |
| Hibernation | Requires swap for suspend-to-disk |
| Idle page eviction | Unused pages can be moved out |
| Overcommit safety | More headroom for memory spikes |
Arguments Against Swap
| Concern | Explanation |
|---|---|
| Latency | Swap is slow, can cause hangs |
| SSD wear | Frequent swapping wears flash |
| Thrashing | Heavy swap = system unusable |
| Memory hiding | Masks memory leaks |
Recommendation
Most systems benefit from some swap:
- Servers: RAM + zswap, small disk swap for emergencies
- Desktops: RAM/2 to RAM, with zswap
- Embedded: Often none (limited storage, predictable workload)
Common Issues
Swap Thrashing
Constant swapping makes system unusable.
Symptoms: High si/so in vmstat, system unresponsive
Solutions:
- Add RAM
- Reduce workload
- Lower swappiness
- Kill memory-hungry processes
Swap Full
No swap space available.
Symptoms: OOM kills despite "free" memory
Solutions:
- Add more swap
- Enable zswap/zram
- Investigate memory usage
SSD Wear
Excessive swap writes wear out the SSD.
Solutions:
- Use zswap (can reduce writes by roughly 2-5x, depending on how compressible the data is)
- Reduce swappiness
- Add RAM
References
Key Code
| File | Description |
|---|---|
| mm/swapfile.c | Swap area management |
| mm/swap_state.c | Swap cache |
| mm/zswap.c | Compressed swap cache |
| drivers/block/zram/ | zram implementation |
Kernel Documentation
Related
- reclaim - When swap is triggered
- page-cache - File pages vs anonymous pages
- mmap - Anonymous memory that gets swapped