Skip to main content Link Menu Expand (external link) Document Search Copy Copied

Understanding Page Tables in xv6 on RISC-V

Introduction to Virtual Memory

Virtual memory is a memory management technique that provides an abstraction layer between memory addresses used by a program (virtual addresses) and the physical memory addresses in RAM (physical addresses). In modern operating systems like xv6, virtual memory serves several critical purposes:

  1. Process isolation: Each process has its own address space, isolated from other processes
  2. Memory protection: The OS can control which memory regions are accessible to a process
  3. Overcommitment: The system can allocate more virtual memory than physically available
  4. Simplified memory model: Programs can use a contiguous address space regardless of physical memory fragmentation

How Page Tables Work: A General Overview

The Concept of Pages and Frames

In virtual memory systems, both virtual and physical memory are divided into fixed-size blocks:

  • Pages: Fixed-size blocks of virtual memory (typically 4KB)
  • Frames: Fixed-size blocks of physical memory (same size as pages)

Page tables perform the crucial task of mapping virtual pages to physical frames.

Basic Page Table Structure

At its most basic level, a page table is simply an array that serves as a lookup table:

  • The index into the array is derived from the virtual page number
  • The value stored at that index contains the physical frame number

For example, in a system with 4KB pages, a 32-bit virtual address could be divided into:

  • Upper 20 bits: Virtual Page Number (VPN)
  • Lower 12 bits: Offset within the page (2^12 = 4096 bytes)

Similarly, a physical address would consist of:

  • Upper bits: Physical Frame Number (PFN)
  • Lower 12 bits: Same offset within the frame

Address Translation Process

When a program accesses memory using a virtual address, the following happens:

  1. The MMU extracts the virtual page number from the virtual address
  2. It looks up the VPN in the page table to find the corresponding physical frame
  3. It combines the physical frame number with the original offset to form the physical address
  4. The physical address is used to access the actual memory location
Virtual Address: [Virtual Page Number | Offset]
                          |
                          v
                     Page Table
                          |
                          v
Physical Address: [Physical Frame Number | Offset]

Multi-level Page Tables

Single-level page tables become impractical for large address spaces because:

  • They require a very large contiguous area of memory
  • Most of the table entries might never be used

Multi-level page tables solve this problem by dividing the page table into a hierarchy:

  • The top-level page table contains pointers to second-level page tables
  • The second-level tables contain pointers to third-level tables, and so on
  • The lowest-level tables contain the actual physical frame numbers

This hierarchical approach allows the OS to allocate page tables on demand, saving memory for unused portions of the address space.

Two-Level Page Table (32-bit Virtual Address)

In a 32-bit architecture with 4KB pages, a typical two-level page table might divide the address as follows:

  • 10 bits for Level 1 index (addressing 1024 entries in the top page table)
  • 10 bits for Level 0 index (addressing 1024 entries in the second-level page table)
  • 12 bits for page offset (4KB pages)
flowchart LR
    VA["Virtual Address<br>[31:22 | 21:12 | 11:0]"] --> L1["L1 Index<br>[31:22]"]
    VA --> L0["L0 Index<br>[21:12]"]
    VA --> OFF["Page Offset<br>[11:0]"]
    
    L1 -->|"Index into"| PT1["Level 1 Page Table<br>(Root Page Table)"]
    PT1 -->|"Entry points to"| PT0["Level 0 Page Table"]
    
    L0 -->|"Index into"| PT0
    PT0 -->|"Entry contains"| PFN["Physical Frame Number"]
    
    PFN -->|"Combined with"| OFF
    OFF -->|"Forms"| PA["Physical Address<br>[PFN | Page Offset]"]
    
    class VA,PA highlight

Three-Level Page Table (RISC-V Sv39 - 39-bit Virtual Address)

In the RISC-V Sv39 scheme (as used in xv6), the 39-bit virtual address is divided as:

  • 9 bits for Level 2 index (bits 30-38)
  • 9 bits for Level 1 index (bits 21-29)
  • 9 bits for Level 0 index (bits 12-20)
  • 12 bits for page offset (bits 0-11)
flowchart LR
    VA["Virtual Address<br>[38:30 | 29:21 | 20:12 | 11:0]"] --> L2["L2 Index<br>[38:30]"]
    VA --> L1["L1 Index<br>[29:21]"]
    VA --> L0["L0 Index<br>[20:12]"]
    VA --> OFF["Page Offset<br>[11:0]"]
    
    L2 -->|"Index into"| PT2["Level 2 Page Table<br>(Root Page Table)"]
    PT2 -->|"Entry points to"| PT1["Level 1 Page Table"]
    
    L1 -->|"Index into"| PT1
    PT1 -->|"Entry points to"| PT0["Level 0 Page Table"]
    
    L0 -->|"Index into"| PT0
    PT0 -->|"Entry contains"| PFN["Physical Frame Number"]
    
    PFN -->|"Combined with"| OFF
    OFF -->|"Forms"| PA["Physical Address<br>[PFN | Page Offset]"]
    
    class VA,PA highlight

Each page table in this hierarchy is itself a page (4KB) containing 512 entries (each entry is 64 bits or 8 bytes, and 512 * 8 = 4096 bytes). The physical address stored in a PTE is the address of the next-level page table (for non-leaf entries) or the actual physical frame (for leaf entries).

Page Table Entries (PTEs)

Each entry in a page table typically contains:

  • The physical frame number
  • Permission bits (read, write, execute)
  • Status bits (valid, dirty, accessed)
  • Other control information

The valid bit is particularly important - it indicates whether the mapping is valid. If this bit is not set, the MMU generates a page fault, which the OS can handle (e.g., by loading data from disk).

Translation Lookaside Buffer (TLB)

To speed up the translation process, modern CPUs include a Translation Lookaside Buffer (TLB), which caches recent virtual-to-physical address translations:

  1. Before checking the page table, the MMU first looks in the TLB
  2. If the translation is found (TLB hit), it’s used directly, avoiding the page table lookup
  3. If not found (TLB miss), the full page table walk is performed, and the result is cached in the TLB

When the OS modifies the page tables, it must invalidate the affected TLB entries to prevent incorrect translations.

Page Faults and Memory Management

When a program tries to access an invalid page (one not currently mapped to physical memory), a page fault occurs:

  1. The MMU raises a page fault exception
  2. The OS takes control and determines the cause of the fault
  3. The OS might load data from disk, allocate a new frame, or terminate the process (segmentation fault)
  4. If the memory access can be satisfied, the OS updates the page table and returns control to the program

This mechanism enables virtual memory’s powerful features like demand paging, memory-mapped files, and copy-on-write.

The Relationship Between Virtual and Physical Address Spaces

Below are diagrams illustrating how virtual and physical address spaces relate in a system like xv6.

User Virtual Address Space to Physical Memory

flowchart TD
    subgraph "User Virtual Address Space"
        VA_Code["Code (Text)<br>0x0000 - 0x1FFF"]
        VA_Data["Data<br>0x2000 - 0x3FFF"]
        VA_Heap["Heap<br>0x4000 - 0x5FFF"]
        VA_Stack["Stack<br>0x6000 - 0x7FFF"]
        VA_High["Trapframe & Trampoline<br>0xFFFF..."]
    end

    subgraph "Page Tables"
        PT["Multi-level Page Table<br>(maps VPN to PFN)"]
    end

    subgraph "Physical Memory"
        PM_Frame1["Frame 1<br>0x80002000 - 0x80002FFF"]
        PM_Frame2["Frame 2<br>0x80005000 - 0x80005FFF"]
        PM_Frame3["Frame 3<br>0x80008000 - 0x80008FFF"]
        PM_Frame4["Frame 4<br>0x8000A000 - 0x8000AFFF"]
        PM_Frame5["Frame 5<br>0x8000F000 - 0x8000FFFF"]
        PM_High["Kernel-reserved Frames<br>..."]
    end

    VA_Code --> PT
    VA_Data --> PT
    VA_Heap --> PT
    VA_Stack --> PT
    VA_High --> PT

    PT --> PM_Frame4
    PT --> PM_Frame1
    PT --> PM_Frame3
    PT --> PM_Frame5
    PT --> PM_High

User and Kernel Virtual Address Spaces to Physical Memory

flowchart TD
    subgraph "User Virtual Address Space"
        UVA_Low["User Code/Data/Heap<br>0x0000 - 0x7FFF"]
        UVA_High["Trap Mechanisms<br>TRAMPOLINE"]
    end

    subgraph "Kernel Virtual Address Space"
        KVA_DevMem["Device Memory<br>UART, VIRTIO, PLIC"]
        KVA_KernelCode["Kernel Code<br>KERNBASE - etext"]
        KVA_KernelData["Kernel Data<br>etext - PHYSTOP"]
        KVA_Trampoline["Trampoline<br>TRAMPOLINE"]
    end

    subgraph "Physical Memory"
        Phys_DevMem["Device Memory<br>0x10000000 - ..."]
        Phys_KernelCode["Kernel Code<br>0x80000000 - ..."]
        Phys_KernelData["Kernel Data<br>..."]
        Phys_UserPages["User Pages<br>Scattered throughout RAM"]
        Phys_Trampoline["Trampoline Code<br>..."]
    end

    UVA_Low --> UserPT["User Page Table"]
    UVA_High --> UserPT

    KVA_DevMem --> KernelPT["Kernel Page Table"]
    KVA_KernelCode --> KernelPT
    KVA_KernelData --> KernelPT
    KVA_Trampoline --> KernelPT

    UserPT --> Phys_UserPages
    UserPT --> Phys_Trampoline

    KernelPT --> Phys_DevMem
    KernelPT --> Phys_KernelCode
    KernelPT --> Phys_KernelData
    KernelPT --> Phys_UserPages
    KernelPT --> Phys_Trampoline

    %% Note that kernel mappings are largely identity mappings
    classDef kernelMappings fill:#f9f,stroke:#333,stroke-width:2px
    class KVA_DevMem,KVA_KernelCode,KVA_KernelData kernelMappings

The RISC-V architecture, on which xv6 runs, implements virtual memory through a hardware mechanism called the Memory Management Unit (MMU). The MMU translates virtual addresses to physical addresses using data structures called page tables.

RISC-V Sv39 Page Table Architecture

xv6 uses RISC-V’s Sv39 virtual memory scheme, which provides a 39-bit virtual address space. This means it can address up to 512GB of virtual memory (2^39 bytes).

Virtual Address Structure

In the Sv39 scheme, a 64-bit virtual address is structured as follows:

63   39 38           30 29           21 20           12 11            0
+-------+---------------+---------------+---------------+---------------+
| Unused|   L2 Index    |   L1 Index    |   L0 Index    |    Offset     |
| (must |   (9 bits)    |   (9 bits)    |   (9 bits)    |   (12 bits)   |
| be 0) |               |               |               |               |
+-------+---------------+---------------+---------------+---------------+
  • Bits 0-11: 12-bit page offset (4KB pages)
  • Bits 12-20: 9-bit Level-0 page table index
  • Bits 21-29: 9-bit Level-1 page table index
  • Bits 30-38: 9-bit Level-2 page table index
  • Bits 39-63: Must be all zeros (or all ones, sign-extended from bit 38)

xv6 defines this structure in the vm.c file:

// The risc-v Sv39 scheme has three levels of page-table
// pages. A page-table page contains 512 64-bit PTEs.
// A 64-bit virtual address is split into five fields:
//   39..63 -- must be zero.
//   30..38 -- 9 bits of level-2 index.
//   21..29 -- 9 bits of level-1 index.
//   12..20 -- 9 bits of level-0 index.
//    0..11 -- 12 bits of byte offset within the page.

Page Table Structure

RISC-V Sv39 uses a three-level page table structure. Each page table is exactly one page (4KB) in size and contains 512 page table entries (PTEs):

  • The L2 (top level) page table contains 512 PTEs, each pointing to an L1 page table
  • Each L1 page table contains 512 PTEs, each pointing to an L0 page table
  • Each L0 page table contains 512 PTEs, each pointing to a physical page

This hierarchical structure allows for efficient memory usage since page tables can be allocated only when needed.

Page Table Entries (PTEs)

Each page table entry in RISC-V is a 64-bit value that either points to the next level page table or to a physical page (leaf entry). The format of a PTE is:

63           54 53        28 27        19 18        10 9   8 7 6 5 4 3 2 1 0
+---------------+------------+------------+------------+-----+-+-+-+-+-+-+-+-+
|    Reserved   | PPN[2]     | PPN[1]     | PPN[0]     | RSW |D|A|G|U|X|W|R|V|
+---------------+------------+------------+------------+-----+-+-+-+-+-+-+-+-+
  • Bits 0-9: Flags
    • V (0): Valid bit. If 0, the PTE is invalid
    • R (1): Read permission
    • W (2): Write permission
    • X (3): Execute permission
    • U (4): User mode access
    • G (5): Global mapping (ignored in xv6)
    • A (6): Accessed (ignored in xv6)
    • D (7): Dirty (ignored in xv6)
    • RSW (8-9): Reserved for supervisor software (unused in xv6)
  • Bits 10-53: Physical Page Number (PPN) - the physical address divided by 4096
    • PPN[0] (bits 10-18): L0 physical page number
    • PPN[1] (bits 19-27): L1 physical page number
    • PPN[2] (bits 28-53): L2 physical page number

In xv6, the PTE flags are defined in riscv.h:

#define PTE_V (1L << 0) // valid
#define PTE_R (1L << 1) // readable
#define PTE_W (1L << 2) // writable
#define PTE_X (1L << 3) // executable
#define PTE_U (1L << 4) // user can access

xv6 also provides macros to convert between physical addresses and PTEs:

#define PA2PTE(pa) ((((uint64)pa) >> 12) << 10)
#define PTE2PA(pte) (((pte) >> 10) << 12)
#define PTE_FLAGS(pte) ((pte) & 0x3FF)

Page Table Operations in xv6

Walking Page Tables

The walk function in vm.c is central to page table operations. It navigates the page table hierarchy to find the PTE for a given virtual address:

pte_t *
walk(pagetable_t pagetable, uint64 va, int alloc)
{
  // Check if virtual address is valid
  if(va >= MAXVA)
    panic("walk");

  // Walk through page table levels
  for(int level = 2; level > 0; level--) {
    // Get PTE index for this level
    pte_t *pte = &pagetable[PX(level, va)];
    
    // If PTE is valid, get next level page table
    if(*pte & PTE_V) {
      pagetable = (pagetable_t)PTE2PA(*pte);
    } else {
      // If PTE is invalid and we're not allocating, return null
      if(!alloc || (pagetable = (pde_t*)kalloc()) == 0)
        return 0;
      // Initialize new page table page
      memset(pagetable, 0, PGSIZE);
      *pte = PA2PTE(pagetable) | PTE_V;
    }
  }
  // Return pointer to final level PTE
  return &pagetable[PX(0, va)];
}

This function is used for:

  1. Finding an existing mapping
  2. Creating new page table levels when needed (if alloc is true)
  3. Returning the address of the PTE so it can be modified

Creating Mappings

The mappages function creates mappings from virtual to physical addresses:

int
mappages(pagetable_t pagetable, uint64 va, uint64 size, uint64 pa, int perm)
{
  uint64 a, last;
  pte_t *pte;

  // Ensure addresses are page-aligned
  if((va % PGSIZE) != 0)
    panic("mappages: va not aligned");
  if((size % PGSIZE) != 0)
    panic("mappages: size not aligned");

  a = va;
  last = va + size - PGSIZE;
  for(;;){
    // Find the PTE for virtual address 'a'
    if((pte = walk(pagetable, a, 1)) == 0)
      return -1;
    if(*pte & PTE_V)
      panic("mappages: remap");
    
    // Create the mapping
    *pte = PA2PTE(pa) | perm | PTE_V;
    
    if(a == last)
      break;
    a += PGSIZE;
    pa += PGSIZE;
  }
  return 0;
}

This function maps a range of virtual addresses to a range of physical addresses, setting the specified permissions.

Freeing Page Tables

The freewalk function recursively frees a page table:

void
freewalk(pagetable_t pagetable)
{
  // There are 512 PTEs in a page table
  for(int i = 0; i < 512; i++){
    pte_t pte = pagetable[i];
    if((pte & PTE_V) && (pte & (PTE_R|PTE_W|PTE_X)) == 0){
      // This PTE points to a lower-level page table
      uint64 child = PTE2PA(pte);
      freewalk((pagetable_t)child);
      pagetable[i] = 0;
    } else if(pte & PTE_V){
      panic("freewalk: leaf");
    }
  }
  kfree((void*)pagetable);
}

This function first frees all child page tables (recursively) and then frees the page table itself. Before freeing a user page table, xv6 uses uvmunmap to free the mapped physical pages.

Kernel vs. User Page Tables

xv6 maintains separate page tables for the kernel and each user process.

Kernel Page Table

The kernel page table is set up during boot by kvminit() which calls kvmmake():

pagetable_t
kvmmake(void)
{
  pagetable_t kpgtbl;

  kpgtbl = (pagetable_t) kalloc();
  memset(kpgtbl, 0, PGSIZE);

  // Map hardware devices
  kvmmap(kpgtbl, UART0, UART0, PGSIZE, PTE_R | PTE_W);
  kvmmap(kpgtbl, VIRTIO0, VIRTIO0, PGSIZE, PTE_R | PTE_W);
  kvmmap(kpgtbl, PLIC, PLIC, 0x4000000, PTE_R | PTE_W);

  // Map kernel text as read-only and executable
  kvmmap(kpgtbl, KERNBASE, KERNBASE, (uint64)etext-KERNBASE, PTE_R | PTE_X);

  // Map kernel data and physical RAM
  kvmmap(kpgtbl, (uint64)etext, (uint64)etext, PHYSTOP-(uint64)etext, PTE_R | PTE_W);

  // Map trampoline for trap entry/exit
  kvmmap(kpgtbl, TRAMPOLINE, (uint64)trampoline, PGSIZE, PTE_R | PTE_X);

  // Map kernel stacks
  proc_mapstacks(kpgtbl);
  
  return kpgtbl;
}

The kernel page table uses direct mapping (also called identity mapping) for most addresses, meaning virtual addresses map to identical physical addresses. This simplifies kernel memory management.

User Page Tables

Each process has its own page table, created by uvmcreate():

pagetable_t
uvmcreate()
{
  pagetable_t pagetable;
  pagetable = (pagetable_t) kalloc();
  if(pagetable == 0)
    return 0;
  memset(pagetable, 0, PGSIZE);
  return pagetable;
}

User page tables map the process’s virtual address space to physical memory, with the typical layout:

  • Starting at virtual address 0: text (code)
  • Followed by data and bss segments
  • Fixed-size stack
  • Expandable heap
  • Special mappings at high addresses:
    • TRAPFRAME: Stores process context during traps
    • TRAMPOLINE: Code that transitions between user and kernel mode

Hardware Page Table Support

The RISC-V hardware supports page tables through several mechanisms:

SATP Register

The Supervisor Address Translation and Protection (satp) register controls the MMU. It contains:

  • Mode (SV39)
  • ASID (Address Space ID, unused in xv6)
  • Physical address of the root page table

xv6 sets up this register using:

#define SATP_SV39 (8L << 60)
#define MAKE_SATP(pagetable) (SATP_SV39 | (((uint64)pagetable) >> 12))

static inline void 
w_satp(uint64 x)
{
  asm volatile("csrw satp, %0" : : "r" (x));
}

void
kvminithart()
{
  // Wait for any previous writes to page table memory to finish
  sfence_vma();
  
  // Load kernel page table
  w_satp(MAKE_SATP(kernel_pagetable));
  
  // Flush stale TLB entries
  sfence_vma();
}

TLB Management

The Translation Lookaside Buffer (TLB) caches virtual-to-physical address translations. When the page tables change, xv6 must invalidate the TLB using the sfence_vma instruction:

static inline void
sfence_vma()
{
  // The zero, zero means flush all TLB entries
  asm volatile("sfence.vma zero, zero");
}

Context Switching and Page Tables

When the OS switches between processes, it must also switch the active page table by updating the satp register. This is done in the swtch function and involves:

  1. Saving the current process state
  2. Updating the satp register to use the new process’s page table
  3. Flushing the TLB to remove stale entries

Conclusion

xv6’s implementation of page tables on RISC-V is a clean, educational example of modern virtual memory systems. The three-level page table structure enables efficient memory management through:

  1. On-demand allocation of page tables
  2. Fine-grained memory protection
  3. Isolation between processes
  4. Efficient translation from virtual to physical addresses

The design choices in xv6 prioritize simplicity and clarity over optimization, making it an excellent reference for understanding virtual memory concepts that apply to many modern operating systems.