MGLRU: Multi-Generation LRU
A generational page reclaim algorithm for better working set detection
The problem with classic LRU
Classic LRU (Least Recently Used) page reclaim keeps an active and an inactive list for each page type (anonymous and file) in every node and memory cgroup. Pages migrate between the two lists based on access. The fundamental problems:
- The scan cost: To find reclaimable pages, the reclaimer walks the inactive list. If the inactive list is full of hot pages (false negatives), it wastes CPU time.
- One-time access pollution: A large dd or tar operation reads every file once, thrashing the active list with cold pages.
- No temporal locality: Classic LRU treats a page accessed 1 second ago the same as one accessed 1 year ago.
MGLRU's generational model
MGLRU (Multi-Generation LRU, merged in 6.1) divides pages into generations. Each generation represents pages that became hot at roughly the same time. The oldest generation is evicted first:
Generation 0 (oldest): pages not accessed since earliest timestamp
Generation 1: pages accessed in the second-oldest period
...
Generation N (newest): recently accessed pages
Reclaim: evict from generation 0 first → if empty, evict generation 1 → ...
Age: promote pages to the newest generation when accessed
Number of generations: between 2 and 4 (MIN_NR_GENS to MAX_NR_GENS). Each generation typically covers seconds to minutes of activity.
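The sliding window of generations can be sketched as a small userspace model. This is an illustration under simplifying assumptions, not kernel code: the toy_ names and the fixed page array are hypothetical, only MAX_NR_GENS borrows a kernel constant name, and the kernel's rule of always keeping a minimum number of generations is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NR_GENS 4   /* kernel constant name; everything else is hypothetical */
#define NR_PAGES    8

struct toy_lru {
	unsigned long min_seq;            /* oldest generation still holding pages */
	unsigned long max_seq;            /* youngest generation */
	unsigned long page_seq[NR_PAGES]; /* generation each page belongs to */
	bool resident[NR_PAGES];
};

/* Start with every page resident in a single generation. */
static void toy_init(struct toy_lru *lru)
{
	lru->min_seq = lru->max_seq = 0;
	for (size_t i = 0; i < NR_PAGES; i++) {
		lru->page_seq[i] = 0;
		lru->resident[i] = true;
	}
}

/* Aging: open a new youngest generation if the window has room. */
static bool toy_age(struct toy_lru *lru)
{
	if (lru->max_seq - lru->min_seq + 1 >= MAX_NR_GENS)
		return false;
	lru->max_seq++;
	return true;
}

/* Access: move a resident page to the youngest generation. */
static void toy_touch(struct toy_lru *lru, size_t page)
{
	if (lru->resident[page])
		lru->page_seq[page] = lru->max_seq;
}

/* Eviction: drop one page from the oldest non-empty generation. */
static int toy_evict_one(struct toy_lru *lru)
{
	while (lru->min_seq <= lru->max_seq) {
		for (size_t i = 0; i < NR_PAGES; i++) {
			if (lru->resident[i] && lru->page_seq[i] == lru->min_seq) {
				lru->resident[i] = false;
				return (int)i;
			}
		}
		if (lru->min_seq == lru->max_seq)
			break;
		lru->min_seq++; /* oldest generation is empty: advance the window */
	}
	return -1; /* nothing left to evict */
}
```

Calling toy_age() once and touching two pages moves them into generation 1; repeated toy_evict_one() calls then drain the untouched pages first and reach the touched ones only after min_seq advances.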
Key data structures
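/* include/linux/mmzone.h (abridged; named lru_gen_struct in the first releases) */
/* Per-lruvec generation state, i.e. per node and per memcg */
struct lru_gen_folio {
	/* Aging increments the youngest generation number */
	unsigned long max_seq;
	/* Eviction increments the oldest generation number of each type */
	unsigned long min_seq[ANON_AND_FILE];
	/* Birth time of each generation, in jiffies */
	unsigned long timestamps[MAX_NR_GENS];
	/* The generation lists themselves */
	struct list_head folios[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
	/* Per-generation page counts, eventually consistent */
	long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
	/* Per-tier eviction and refault counters for the refault feedback loop */
	atomic_long_t evicted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS];
	atomic_long_t refaulted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS];
	/* ... the Bloom filters used by aging live in struct lru_gen_mm_state ... */
};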
/* Per-folio generation tracking embedded in folio flags */
/* gen = folio_lru_gen(folio): which generation this folio is in */
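How a generation number fits into a flags word can be sketched as follows. The kernel maps an ever-growing sequence number to a list index by taking it modulo MAX_NR_GENS and stores the index plus one in the folio flags, so that zero means "not on a generation list". The bit position, field width, and toy_ names below are hypothetical choices for illustration only.

```c
#include <stdint.h>

#define MAX_NR_GENS   4u
#define TOY_GEN_WIDTH 3u  /* hypothetical: enough bits for values 0..MAX_NR_GENS */
#define TOY_GEN_SHIFT 8u  /* hypothetical bit position inside the flags word */
#define TOY_GEN_MASK  (((UINT64_C(1) << TOY_GEN_WIDTH) - 1) << TOY_GEN_SHIFT)

/* Sequence numbers grow without bound; the list index wraps around. */
static unsigned int toy_gen_from_seq(unsigned long seq)
{
	return (unsigned int)(seq % MAX_NR_GENS);
}

/* Store gen + 1 so that an all-zero field means "no generation". */
static uint64_t toy_set_gen(uint64_t flags, unsigned int gen)
{
	flags &= ~TOY_GEN_MASK;
	return flags | ((uint64_t)(gen + 1) << TOY_GEN_SHIFT);
}

/* Return the generation index, or -1 if the field is empty. */
static int toy_get_gen(uint64_t flags)
{
	return (int)((flags & TOY_GEN_MASK) >> TOY_GEN_SHIFT) - 1;
}
```

Keeping the generation in spare flag bits means no extra per-folio field is needed, at the cost of competing with other users of those bits.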
Aging: scanning for hot pages
Aging walks page tables to find recently accessed pages and promote them to the newest generation:
/* mm/vmscan.c (simplified; the signature differs between kernel versions) */
static void walk_mm(struct lruvec *lruvec, struct mm_struct *mm,
		    struct lru_gen_mm_walk *walk)
{
	static const struct mm_walk_ops mm_walk_ops = {
		.test_walk = should_skip_vma,
		.p4d_entry = walk_pud_range,
	};

	walk->next_addr = FIRST_USER_ADDRESS;

	/* Walk the address space, testing and clearing PTE accessed bits */
	if (mmap_read_trylock(mm)) {
		walk_page_range(mm, walk->next_addr, ULONG_MAX, &mm_walk_ops, walk);
		mmap_read_unlock(mm);
	}
}
The accessed bit in PTEs tells the kernel which pages have been touched since the last aging pass. Pages with the accessed bit set are promoted to the newest generation; those without are left in their old generation (and will be evicted sooner).
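The test-and-clear step can be sketched in userspace as a loop over an array of simplified page table entries. This is a toy model under stated assumptions: struct toy_pte and toy_aging_pass are hypothetical, and real aging works on hardware page tables and folios rather than a flat array.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_pte {
	bool accessed;      /* set by "hardware" when the page is touched */
	unsigned long seq;  /* generation of the page this entry maps */
};

/*
 * One aging pass: entries with the accessed bit set move to max_seq and
 * have the bit cleared; the rest keep their generation.
 * Returns the number of pages promoted.
 */
static size_t toy_aging_pass(struct toy_pte *ptes, size_t n,
			     unsigned long max_seq)
{
	size_t promoted = 0;

	for (size_t i = 0; i < n; i++) {
		if (!ptes[i].accessed)
			continue;
		ptes[i].accessed = false; /* so the next pass sees only new accesses */
		ptes[i].seq = max_seq;
		promoted++;
	}
	return promoted;
}
```

Clearing the bit is what makes each pass measure accesses since the previous pass; a second pass with no intervening accesses promotes nothing.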
Eviction: reclaiming old generations
Eviction reclaims pages from the oldest generation:
/* mm/vmscan.c (simplified; parameters differ between kernel versions) */
static int evict_folios(struct lruvec *lruvec, struct scan_control *sc,
			int swappiness)
{
	int type;
	int scanned;
	int reclaimed;
	LIST_HEAD(list);
	struct reclaim_stat stat;
	struct pglist_data *pgdat = lruvec_pgdat(lruvec);

	spin_lock_irq(&lruvec->lru_lock);
	/* Pick a type (anon or file) and isolate folios from its oldest generation */
	scanned = isolate_folios(lruvec, sc, swappiness, &type, &list);
	/* Advance min_seq past generations that are now empty */
	scanned += try_to_inc_min_seq(lruvec, swappiness);
	spin_unlock_irq(&lruvec->lru_lock);

	if (list_empty(&list))
		return scanned;

	/* Try to reclaim them (writeback, swap, or just free) */
	reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false);
	/* ... put back folios that could not be reclaimed, account 'reclaimed' ... */
	return scanned;
}
Working set protection: refault detection
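MGLRU detects refaults, that is, evicted pages that are faulted back in shortly afterwards, which indicate that reclaim cut into the working set. It does this with shadow entries left behind in the page cache or swap cache, not with a Bloom filter:
- When a folio is evicted, a shadow entry recording its memcg, node, generation sequence number, and access tier takes its place in the cache
- When the folio is faulted back in, MGLRU compares the recorded sequence number with the lruvec's current min_seq to decide whether it was recently evicted
- If yes: the refault is counted against the folio's tier, and a feedback loop protects tiers that refault more often than the baseline from eviction
/* mm/workingset.c (simplified; details differ between kernel versions) */
static bool lru_gen_test_recent(void *shadow, bool file, struct lruvec **lruvec,
				unsigned long *token, bool *workingset)
{
	int memcg_id;
	unsigned long min_seq;
	struct mem_cgroup *memcg;
	struct pglist_data *pgdat;

	/* Decode what eviction stored in the shadow entry */
	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
	memcg = mem_cgroup_from_id(memcg_id);
	*lruvec = mem_cgroup_lruvec(memcg, pgdat);

	/* Recent if the recorded sequence matches the current oldest generation */
	min_seq = READ_ONCE((*lruvec)->lrugen.min_seq[file]);
	return (*token >> LRU_REFS_WIDTH) == (min_seq & (EVICTION_MASK >> LRU_REFS_WIDTH));
}
MGLRU's Bloom filters serve aging instead: they remember which page table regions contained enough accessed entries to be worth walking on the next pass.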
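For readers unfamiliar with Bloom filters, the structure itself can be sketched in a few lines. This is a generic two-position filter written for illustration; the size, the hash mixer, and all toy_ names are assumptions and do not reproduce the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOOM_BITS 1024u  /* hypothetical size; must be a power of two */

struct toy_bloom {
	uint8_t bits[TOY_BLOOM_BITS / 8];
};

/* 64-bit mixer (the splitmix64 finalizer) used to derive two bit positions. */
static uint64_t toy_mix(uint64_t x)
{
	x ^= x >> 30; x *= UINT64_C(0xbf58476d1ce4e5b9);
	x ^= x >> 27; x *= UINT64_C(0x94d049bb133111eb);
	x ^= x >> 31;
	return x;
}

static void toy_bloom_positions(uint64_t key, unsigned int pos[2])
{
	uint64_t h = toy_mix(key);

	pos[0] = (unsigned int)(h & (TOY_BLOOM_BITS - 1));
	pos[1] = (unsigned int)((h >> 32) & (TOY_BLOOM_BITS - 1));
}

static void toy_bloom_reset(struct toy_bloom *f)
{
	memset(f->bits, 0, sizeof(f->bits));
}

static void toy_bloom_add(struct toy_bloom *f, uint64_t key)
{
	unsigned int pos[2];

	toy_bloom_positions(key, pos);
	for (int i = 0; i < 2; i++)
		f->bits[pos[i] / 8] |= (uint8_t)(1u << (pos[i] % 8));
}

/* May report a false positive, never a false negative. */
static bool toy_bloom_test(const struct toy_bloom *f, uint64_t key)
{
	unsigned int pos[2];

	toy_bloom_positions(key, pos);
	for (int i = 0; i < 2; i++)
		if (!(f->bits[pos[i] / 8] & (1u << (pos[i] % 8))))
			return false;
	return true;
}
```

The trade-off is fixed, small memory in exchange for occasional false positives, which is acceptable when a wrong "yes" only costs some extra scanning.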
Comparison: classic LRU vs MGLRU
| Aspect | Classic LRU | MGLRU |
|---|---|---|
| Data structure | active + inactive lists | 4 generation lists |
| Reclaim target | scan inactive list | evict oldest generation |
| Access tracking | accessed bits checked through rmap, one folio at a time | accessed bits harvested mainly by walking page tables |
| Working set detection | refault distance from shadow entries | shadow entries plus per-tier refault feedback |
| Streaming reads | pollutes active list | one-time accesses stay in old gen |
| Memory overhead | one list_head per folio plus flag bits | same list_head; generation and tier in folio flags; Bloom filters for aging |
Configuration
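# MGLRU is available since 6.1 (CONFIG_LRU_GEN); it is on at boot only if
# the kernel was built with CONFIG_LRU_GEN_ENABLED=y
cat /sys/kernel/mm/lru_gen/enabled
# 0x0007 ← bitmask: 0x1=main switch, 0x2=batched accessed-bit clearing in leaf page table entries, 0x4=accessed-bit clearing in non-leaf entries
# Enable/disable components
echo 0 > /sys/kernel/mm/lru_gen/enabled # disable MGLRU (use classic LRU)
echo 0x0007 > /sys/kernel/mm/lru_gen/enabled # enable all
# Thrashing prevention: size of the protected working set, in milliseconds
cat /sys/kernel/mm/lru_gen/min_ttl_ms
# 0 ← protection disabled (default)
# Keep the working set of the last second from being evicted; the OOM killer
# is triggered if that working set cannot be kept in memory
echo 1000 > /sys/kernel/mm/lru_gen/min_ttl_ms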
# Number of generations: controlled by MAX_NR_GENS (compile-time, usually 4)
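A program that wants to check the MGLRU state can read the same sysfs file and test individual bits. The sketch below assumes the bit meanings listed in Documentation/admin-guide/mm/multigen_lru.rst (0x1 main switch, 0x2 batched clearing in leaf entries, 0x4 clearing in non-leaf entries); the TOY_ macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names for the documented bits of lru_gen/enabled */
#define TOY_LRU_GEN_CORE          0x0001u /* main switch */
#define TOY_LRU_GEN_MM_WALK       0x0002u /* batched clearing, leaf entries */
#define TOY_LRU_GEN_NONLEAF_YOUNG 0x0004u /* clearing in non-leaf entries */

/* Parse text such as "0x0007\n" into a bitmask. */
static bool toy_parse_caps(const char *text, unsigned int *caps)
{
	char *end;
	unsigned long v = strtoul(text, &end, 0); /* base 0 accepts the 0x prefix */

	if (end == text)
		return false;
	*caps = (unsigned int)v;
	return true;
}

/* Read the bitmask from a file; fails if MGLRU is not built in. */
static bool toy_read_caps(const char *path, unsigned int *caps)
{
	char buf[32];
	bool ok;
	FILE *f = fopen(path, "r");

	if (!f)
		return false;
	ok = fgets(buf, sizeof(buf), f) && toy_parse_caps(buf, caps);
	fclose(f);
	return ok;
}
```

A caller would pass "/sys/kernel/mm/lru_gen/enabled" to toy_read_caps() and treat a failed read as "MGLRU not available" rather than as "disabled".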
Observing MGLRU
# Generation distribution per memcg and node (needs debugfs mounted)
cat /sys/kernel/debug/lru_gen
# Output format:
# memcg  memcg_id  memcg_path
#    node  node_id
#        min_gen_nr  age_in_ms  nr_anon_pages  nr_file_pages
#        ...
#        max_gen_nr  age_in_ms  nr_anon_pages  nr_file_pages
# Reclaim activity (counters shared with classic LRU)
grep -E "pgscan_|pgsteal_" /proc/vmstat
# pgpromote_* and pgdemote_* are not generation counters: they count page
# migrations between NUMA memory tiers
# Refault rate (working set violations)
grep workingset_refault /proc/vmstat
# workingset_refault_file 1234
# perf: track LRU events
perf stat -e vmscan:mm_vmscan_lru_isolate -a sleep 5
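To track refaults from a monitoring program rather than the shell, the counters can be looked up by name in the text of /proc/vmstat. The helper below is a minimal sketch with a hypothetical name; it works on a buffer the caller has already read, and the counter names passed in (for example workingset_refault_file on recent kernels) are whatever the running kernel exposes.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Find a "name value" line in vmstat-formatted text and return the value. */
static bool toy_vmstat_lookup(const char *text, const char *name,
			      unsigned long long *value)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Require a space after the name so prefixes do not match */
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return sscanf(line + len, "%llu", value) == 1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return false;
}
```

Sampling the same counter twice and dividing the difference by the interval gives a refault rate, which is usually more informative than the absolute count.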
Further reading
- Page Reclaim — overall reclaim framework
- Reclaim Throttling — memory pressure signaling
- Swap — where anon pages go when evicted
- Page Cache — file-backed page lifecycle
- mm/vmscan.c in the kernel tree — MGLRU implementation
- Documentation/admin-guide/mm/multigen_lru.rst in the kernel tree