ARM64 Page Tables

TTBR0/TTBR1, translation granules, PTE format, and ASID

ARM64's MMU uses a two-register design to handle user and kernel address spaces independently. This page covers the hardware page table walk, how Linux structures its page table types, the PTE bit layout, ASID-based TLB management, and Stage 2 translation for KVM.


Two TTBRs: user and kernel address spaces

ARM64 splits the virtual address space into a low half and a high half, selected by the top bits of the virtual address. Two translation base registers tell the MMU where to start the walk:

Register    Address range                               Used for
─────────   ─────────────────────────────────────────   ────────────────────────
TTBR0_EL1   [0, 2^VA_BITS) — low addresses              User space (per-process)
TTBR1_EL1   [2^64 - 2^VA_BITS, 2^64) — high addresses   Kernel (global)

With the default VA_BITS=48, user space occupies [0, 0x0000_FFFF_FFFF_FFFF] and the kernel occupies [0xFFFF_0000_0000_0000, 0xFFFF_FFFF_FFFF_FFFF]. The top bit (bit 63) of a virtual address selects the register: bit 63 = 0 uses TTBR0, bit 63 = 1 uses TTBR1. Any address whose bits [63:VA_BITS] are not all zeros (TTBR0 range) or all ones (TTBR1 range) causes a Translation Fault.

52-bit VA (optional): 52-bit virtual addresses (CONFIG_ARM64_VA_BITS_52) come from two extensions. FEAT_LVA (ARMv8.2) enables them with the 64KB granule, where the extra bits come from widening the top-level index field (still 3 levels); FEAT_LPA2 (ARMv8.7) extends them to the 4KB and 16KB granules, adding a fifth level for 4KB. This extends the user and kernel ranges to 4PB each. Most production configs still use 48-bit.

On context switch, the kernel writes the new process's PGD physical address into TTBR0_EL1. The ASID (see below) is packed into bits [63:48] of TTBR0_EL1 at the same time. TTBR1_EL1 is set once at boot and never changes.

/* arch/arm64/include/asm/mmu_context.h */
static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm)
{
    BUG_ON(pgd == swapper_pg_dir);
    cpu_set_reserved_ttbr0();
    /* ASID packed in mm->context.id; cpu_do_switch_mm writes TTBR0_EL1
     * with the new PGD PA and ASID together — no TLB flush needed when
     * switching to a different ASID (that is the whole point of ASIDs). */
    cpu_do_switch_mm(virt_to_phys(pgd), mm);
}

Translation granules

The translation granule is the base page size. ARM64 supports three granule sizes, each changing the number of levels needed and the size of each level's table:

Granule   Page size   Levels (48-bit VA)   Notes
───────   ─────────   ──────────────────   ──────────────────────────────────────────
4KB       4KB         4 (L0→L1→L2→L3)      Most common; default in Linux
16KB      16KB        4 (L0→L1→L2→L3)      Apple Silicon uses this
64KB      64KB        3 (L1→L2→L3)         Fewer levels; large contiguous TLB entries

The granule is selected by TCR_EL1.TG0 (for TTBR0) and TCR_EL1.TG1 (for TTBR1).

4KB granule: level coverage (48-bit VA)

With a 4KB granule each page table is exactly one 4KB page containing 512 8-byte entries (9 index bits per level):

Level   Index bits   Entries   Coverage per entry
─────   ──────────   ───────   ──────────────────
L0      [47:39]        512      512 GB
L1      [38:30]        512        1 GB
L2      [29:21]        512        2 MB  (block descriptor = huge page)
L3      [20:12]        512        4 KB  (page descriptor = normal page)
                                ────
Page offset [11:0]     —        4 KB

The 4-level page table walk (48-bit VA, 4KB granule)

For a 48-bit virtual address with a 4KB granule, the hardware performs a 4-level walk:

48-bit Virtual Address:
 47      39 38      30 29      21 20      12 11       0
 ┌─────────┬──────────┬──────────┬──────────┬──────────┐
 │ L0 index│ L1 index │ L2 index │ L3 index │  offset  │
 │ (9 bits)│ (9 bits) │ (9 bits) │ (9 bits) │ (12 bits)│
 └─────────┴──────────┴──────────┴──────────┴──────────┘

Walk (TTBR0 for user, TTBR1 for kernel):

TTBR0_EL1 or TTBR1_EL1 → physical address of L0 table
  L0_table[VA[47:39]]   → physical address of L1 table
    L1_table[VA[38:30]] → physical address of L2 table  (or 1GB block)
      L2_table[VA[29:21]] → physical address of L3 table (or 2MB block)
        L3_table[VA[20:12]] → physical address of 4KB page frame
          + VA[11:0]         = final physical address

The MMU caches translations in the TLB, keyed on VA + ASID. On a TLB miss, the hardware page table walker performs the walk automatically (hardware-managed TLB on ARM64 — no software TLB fill required in the common case).


PTE format

ARM64 descriptors are 64-bit values. The meaning of bits depends on the level and whether the entry is a table, block, or page descriptor.

ARM64 Page/Block Descriptor (64-bit):

 63 62    59 58    55 54  53  52  51   48 47              12 11 10 9 8 7 6 5 4   2 1    0
 ┌──┬───────┬────────┬───┬───┬────┬───────┬────────────────┬──┬──┬───┬───┬──┬─────┬────┬─┐
 │ig│ PBHA  │   SW   │UXN│PXN│Cont│ resvd │ Output Address │nG│AF│SH │AP │NS│AtIdx│type│V│
 └──┴───────┴────────┴───┴───┴────┴───────┴────────────────┴──┴──┴───┴───┴──┴─────┴────┴─┘

 bit 0:       Valid (V)       — 1 = entry is valid
 bit 1:       Type            — at L0-L2: 1 = table descriptor, 0 = block descriptor
                                at L3: 1 = page descriptor (must be 1 for valid pages)
 bits [4:2]:  AttrIndx[2:0]   — index into MAIR_EL1 (memory type selection)
 bit 5:       NS              — Non-Secure (applies in Secure state)
 bits [7:6]:  AP[2:1]         — Access Permissions (see table below)
 bits [9:8]:  SH[1:0]         — Shareability: 00=non-shareable, 10=outer, 11=inner
 bit 10:      AF              — Access Flag: fault on first access if 0 (SW manages)
 bit 11:      nG              — not-Global: 1 = ASID-tagged (user), 0 = global (kernel)
 bits [47:12]: Output Address (OA) — physical page frame number (bits [47:12] of PA)
 bit 52:      Contiguous hint — TLB can merge contiguous entries
 bit 53:      PXN             — Privileged Execute Never (EL1 cannot execute)
 bit 54:      UXN             — Unprivileged Execute Never (EL0 cannot execute)
 bits [58:55]: SW             — Software-defined (kernel uses for _PAGE_* flags)
 bits [62:59]: PBHA           — Page-Based Hardware Attributes (FEAT_HPDS2, ARMv8.2+)
 bit 63:      Ignored in block/page descriptors (NSTable in table descriptors)

Access Permission (AP[2:1]) encoding

AP[2:1]   EL1 (kernel)   EL0 (user)
───────   ────────────   ──────────
00        Read/Write     No access
01        Read/Write     Read/Write
10        Read-Only      No access
11        Read-Only      Read-Only

Linux sets UXN on all kernel mappings (so EL0 cannot execute kernel memory) and PXN on all user mappings (so EL1 cannot execute user code). The AF (Access Flag) bit: on pre-ARMv8.1 hardware it is managed by software — a PTE with AF=0 generates an Access Flag Fault on first access, which the kernel handles by setting AF=1. On ARMv8.1+ hardware with FEAT_HAFDBS, Linux enables hardware-managed AF (TCR_EL1.HA=1), so the CPU sets AF automatically without faulting. The page reclaim path uses the Access Flag to distinguish recently-accessed pages.

ARM64 vs x86-64 PTE comparison

Concept                  ARM64                             x86-64
──────────────────────   ───────────────────────────────   ─────────────────────────────────
Valid bit                bit 0 (V)                         bit 0 (P — Present)
Read/Write               AP[2:1] field                     bit 1 (R/W)
User access              AP[2:1] field                     bit 2 (U/S)
Execute disable          UXN (bit 54), PXN (bit 53)        bit 63 (NX)
Cache type               AttrIndx → MAIR_EL1               PCD/PWT/PAT bits
Huge page marker         bit 1 = 0 at L1/L2 (block)        PS bit at PMD/PUD
Global (no ASID flush)   nG = 0                            bit 8 (G)
Dirty / Accessed         SW-managed via AF fault           Hardware-set bits 5 (A) and 6 (D)
                         (hardware with FEAT_HAFDBS)

Huge pages: block descriptors

ARM64 uses block descriptors to map large contiguous physical regions without traversing all four levels. A block descriptor at L1 or L2 has bit 1 (type) = 0:

Level   Block size   Linux term
─────   ──────────   ───────────────────────────────────────────
L1      1 GB         pud_huge() — 1GB huge page
L2      2 MB         pmd_huge() — 2MB huge page (THP, hugetlbfs)

Linux detects block descriptors with pmd_huge(), defined in arch/arm64/mm/hugetlbpage.c:

/* arch/arm64/mm/hugetlbpage.c */
int pmd_huge(pmd_t pmd)
{
    return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
}

PMD_TABLE_BIT is bit 1. A valid PMD entry with bit 1 clear is a 2MB block descriptor.

Transparent Huge Pages (THP) and hugetlbfs both use 2MB block descriptors. The kernel also maps its own text and data sections with 2MB blocks when alignment allows (visible in arch/arm64/mm/mmu.c via create_mapping_noalloc()).


Linux kernel page table types

Linux uses a generic five-level abstraction over the hardware levels. On ARM64 with 48-bit VA and 4KB granule, p4d_t is folded (a no-op alias for pgd_t):

Linux type Hardware level Source
pgd_t L0 (pointed to by TTBR0/TTBR1) arch/arm64/include/asm/pgtable-types.h
p4d_t Folded into PGD (48-bit VA) include/asm-generic/pgtable-nop4d.h
pud_t L1 arch/arm64/include/asm/pgtable-types.h
pmd_t L2 arch/arm64/include/asm/pgtable-types.h
pte_t L3 arch/arm64/include/asm/pgtable-types.h

All types are wrappers around u64. The underlying hardware descriptor value is accessed with pgd_val(), pud_val(), pmd_val(), pte_val().

Standard page table navigation macros:

/* Given a mm_struct and virtual address, walk down to the PTE: */
pgd_t *pgd = pgd_offset(mm, addr);       /* index into mm->pgd */
p4d_t *p4d = p4d_offset(pgd, addr);      /* folded on 48-bit ARM64 */
pud_t *pud = pud_offset(p4d, addr);
pmd_t *pmd = pmd_offset(pud, addr);
pte_t *pte = pte_offset_map(pmd, addr);  /* maps the PTE page, returns kernel VA */

pte_offset_map() comes from the generic page table layer (include/linux/pgtable.h) and handles the physical-to-virtual translation for the PTE page pointer.


ASID: Address Space Identifiers

Every user-space mapping in a PTE has nG (not-Global) = 1. This means the TLB tags the entry with the current ASID (Address Space Identifier). Different processes with different ASIDs can coexist in the TLB without interference, avoiding a full TLB flush on every context switch.

ASID storage

The ASID is stored in TTBR0_EL1[63:48] (16-bit ASID field, when TCR_EL1.AS=1). Each time a process is scheduled, the kernel writes both the ASID and the PGD address into TTBR0_EL1 atomically. Kernel mappings use nG=0 (global), so they are not tagged with an ASID and are never evicted by ASID-based TLB operations.

The per-process ASID is stored in mm->context.id and managed by check_and_switch_context() in arch/arm64/mm/context.c:

/* arch/arm64/mm/context.c */
void check_and_switch_context(struct mm_struct *mm)
{
    unsigned long flags;
    unsigned int cpu;
    u64 asid, old_active_asid;

    asid = atomic64_read(&mm->context.id);

    /*
     * If the current ASID is still valid for this CPU, use it without
     * taking the lock.  Otherwise fall through to the slow path.
     */
    old_active_asid = atomic64_read(this_cpu_ptr(&active_asids));
    if (asid_gen_match(asid) &&
        atomic64_cmpxchg_relaxed(this_cpu_ptr(&active_asids),
                                 old_active_asid, asid))
        goto switch_mm_fastpath;

    raw_spin_lock_irqsave(&cpu_asid_lock, flags);
    /* ... slow path: allocate new ASID or flush on generation wrap ... */
    raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);

switch_mm_fastpath:
    cpu_switch_mm(mm->pgd, mm);
}

ASID generation wrap

ARM64 supports 8-bit or 16-bit ASIDs (TCR_EL1.AS). Linux detects ASID width at runtime from ID_AA64MMFR0_EL1.ASIDBits and uses 16-bit ASIDs where available. With 16-bit ASIDs there are 65536 possible values. When they are exhausted, the kernel bumps an internal generation counter and flushes the entire TLB (TLBI VMALLE1IS), then reallocates ASIDs from scratch.


MAIR_EL1: Memory Attribute Indirection Register

The AttrIndx[2:0] field in every PTE is an index into MAIR_EL1, which maps 8 attribute slots to actual memory types. This indirection keeps the PTE format compact while still supporting many memory types.

Linux sets up MAIR_EL1 once at boot in arch/arm64/mm/proc.S:

/* arch/arm64/mm/proc.S — MAIR_EL1 setup during __cpu_setup */
/*
 * MAIR_EL1 attribute encoding (one byte per slot, 8 slots = 64 bits):
 *
 * Slot  AttrIndx  Attr byte  Memory type
 * ────  ────────  ─────────  ───────────────────────────────────────
 *   0     0b000   0x00       Device-nGnRnE  (strongly ordered device)
 *   1     0b001   0x04       Device-nGnRE   (device, gathering+reordering ok)
 *   2     0b010   0x0C       Device-GRE     (device, gathering+reordering+early-write ok)
 *   3     0b011   0x44       Normal Non-Cacheable (NC)
 *   4     0b100   0xFF       Normal Write-Back, Read-Allocate, Write-Allocate (WB)
 *   5     0b101   0xBB       Normal Write-Through, Read-Allocate (WT)
 *   6     0b110   0x40       Normal Non-Cacheable Outer, NC Inner
 *   7     0b111   0xFF       Normal WB (duplicate; used by Tagged Normal memory)
 *
 * Linux primary slots:
 *   MT_DEVICE_nGnRnE  = 0  (ioremap strongly ordered)
 *   MT_DEVICE_nGnRE   = 1  (ioremap device)
 *   MT_DEVICE_GRE     = 2  (ioremap write-combining)
 *   MT_NORMAL_NC      = 3  (DMA non-cacheable)
 *   MT_NORMAL         = 4  (normal RAM — all kernel/user memory)
 *   MT_NORMAL_WT      = 5  (write-through)
 */
ldr x5, =MAIR_EL1_SET
msr mair_el1, x5
isb

The constant MAIR_EL1_SET is built from individual MAIR_ATTRIDX() macros in arch/arm64/include/asm/pgtable-hwdef.h. User-space and kernel text/data use slot 4 (MT_NORMAL, AttrIndx=0b100). Device memory mapped via ioremap() uses slot 0 (MT_DEVICE_nGnRnE).


TLB operations

ARM64 uses broadcast TLB invalidation instructions (TLBI) rather than IPIs. These instructions propagate across CPUs within a shareability domain automatically when the IS (Inner Shareable) suffix is used.

Key TLBI instructions

Instruction Invalidates When to use
TLBI VAE1IS, Xt Entry by VA + current ASID, inner shareable Single page unmap (user)
TLBI VALE1IS, Xt Entry by VA + ASID, last-level only More common for leaf pages
TLBI ASIDE1IS, Xt All entries matching ASID, inner shareable Process exit (mm teardown)
TLBI VMALLE1IS All EL1 entries (all ASIDs), inner shareable ASID generation wrap
TLBI VAAE1IS, Xt Entry by VA, all ASIDs Kernel mapping change

The operand Xt encodes the VA shifted right by 12 (page-aligned) ORed with the ASID in bits [63:48].

Required TLB invalidation sequence

Writing a new PTE and then invalidating the TLB must follow a strict ordering to prevent the MMU from caching a stale translation:

1. Write new PTE to memory
2. DSB ISHST       — ensure the PTE store is visible to all CPUs before TLBI
3. TLBI VAE1IS     — broadcast TLB invalidation
4. DSB ISH         — wait for TLB invalidation to complete on all CPUs
5. ISB             — flush instruction pipeline (see new mappings for instruction fetch)

In Linux C code, flush_tlb_page() in arch/arm64/include/asm/tlbflush.h implements this:

/* arch/arm64/include/asm/tlbflush.h */
static inline void flush_tlb_page(struct vm_area_struct *vma,
                                   unsigned long uaddr)
{
    unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));

    dsb(ishst);              /* DSB ISHST: wait for PTE stores to complete */
    __tlbi(vale1is, addr);   /* TLBI VALE1IS: invalidate last-level TLB entry */
    dsb(ish);                /* DSB ISH: wait for TLB invalidation to propagate */
}

Note there is no ISB in flush_tlb_page() itself because it is only used for data mappings; the ISB is required only when instruction mappings change (e.g., flush_icache_range()).

__TLBI_VADDR() is a macro that encodes the page-aligned address and ASID into the format expected by the TLBI operand. __tlbi() expands to an inline assembly TLBI instruction via the SYS instruction encoding.


Stage 2 translation (KVM)

When KVM is active, the hypervisor runs at EL2 (with VHE, the host kernel itself runs at EL2) and guests run at EL1/EL0. Guest virtual addresses are translated first by the Stage 1 page tables (the guest OS's own TTBR0_EL1/TTBR1_EL1) to Intermediate Physical Addresses (IPA), then by Stage 2 page tables to real Physical Addresses (PA).

Stage 2 is controlled by VTTBR_EL2, which holds the physical address of the Stage 2 PGD (the IPA→PA translation table). The Stage 2 tables use a similar descriptor format to Stage 1 but with different attribute fields (S2AP, S2SH, MemAttr).

Guest VA ──[Stage 1: TTBR0/TTBR1]──► IPA ──[Stage 2: VTTBR_EL2]──► PA
                 (Guest OS controls)              (KVM controls)

TLB invalidation for guests uses the Stage 2 / VMID-scoped TLBI forms (e.g., TLBI IPAS2E1IS) so that only entries belonging to the current VM's VMID are invalidated. The VMID is stored in VTTBR_EL2[55:48] (8-bit; extended to [63:48] with FEAT_VMID16).

Stage 2 management in Linux lives in arch/arm64/kvm/mmu.c. See the KVM Architecture and Memory Virtualization docs for details.


Observing ARM64 page tables

# Virtual address layout on ARM64
dmesg | grep -E "Virtual kernel memory layout" -A 20

# Per-process page table memory
cat /proc/$$/status | grep VmPTE

# Memory mapping of current process
cat /proc/$$/maps
cat /proc/$$/smaps   # includes page-level breakdown

# Check VA_BITS (kernel config)
zcat /proc/config.gz | grep ARM64_VA_BITS

# Check translation granule
zcat /proc/config.gz | grep ARM64_4K_PAGES

# TLB miss rate (requires PMU access)
perf stat -e dTLB-load-misses,iTLB-load-misses <command>

# ASID allocation (DEBUG_VM builds expose asid info via dmesg)
dmesg | grep -i asid

# Kernel page table dump (requires CONFIG_PTDUMP_DEBUGFS)
cat /sys/kernel/debug/kernel_page_tables

Key kernel functions and files

Symbol                                     File                                     Purpose
────────────────────────────────────────   ──────────────────────────────────────   ─────────────────────────────────────
pgd_t, pud_t, pmd_t, pte_t                 arch/arm64/include/asm/pgtable-types.h   Page table entry types
pgd_offset(), pud_offset(), pmd_offset()   arch/arm64/include/asm/pgtable.h         VA → page table level pointer
pte_offset_map()                           include/linux/pgtable.h                  Map PTE page, return pointer
pmd_huge()                                 arch/arm64/mm/hugetlbpage.c              Detect 2MB block descriptor
pud_huge()                                 arch/arm64/mm/hugetlbpage.c              Detect 1GB block descriptor
set_pte_at()                               arch/arm64/include/asm/pgtable.h         Write PTE (with barrier)
flush_tlb_page()                           arch/arm64/include/asm/tlbflush.h        Single-page TLB invalidation
flush_tlb_mm()                             arch/arm64/include/asm/tlbflush.h        Full mm TLB flush (ASIDE1IS)
check_and_switch_context()                 arch/arm64/mm/context.c                  ASID allocation and mm switch
cpu_do_switch_mm()                         arch/arm64/mm/proc.S                     Write TTBR0_EL1 with new ASID+PGD
__cpu_setup()                              arch/arm64/mm/proc.S                     Configure TCR_EL1, MAIR_EL1, SCTLR_EL1
create_mapping_noalloc()                   arch/arm64/mm/mmu.c                      Build kernel page table entries
MAIR_EL1_SET                               arch/arm64/include/asm/pgtable-hwdef.h   MAIR value built from slot macros

Further reading

  • Memory Model — ARM64 weak memory ordering, DSB/ISB, page table barriers
  • Exception Model — EL0–EL3, how EL2 relates to Stage 2 translation
  • Page Tables (generic) — Linux pgd/pud/pmd/pte abstraction layer
  • Page Fault Handler — how the kernel handles Translation Faults and Access Flag Faults
  • Transparent Huge Pages — how Linux promotes 4KB pages to 2MB block descriptors
  • KVM Memory Virtualization — Stage 2 page tables, EPT/NPT analog on ARM64
  • TLB Optimization — batched TLB invalidation and mmu_gather
  • ARM Architecture Reference Manual (DDI 0487) — D5: AArch64 Virtual Memory System Architecture
  • Documentation/arch/arm64/memory.rst in the kernel tree — ARM64 virtual address layout