
CS326 Project 04: Implementing mmap() in xv6

Practice Exam Problems with Solutions

Easy Problems

  1. Memory Mapping Basics: What is the primary purpose of the shared memory implementation of mmap() in the xv6 operating system? Explain what functionality it provides.

    Solution: The primary purpose of the shared memory implementation of mmap() in xv6 is to allow multiple processes to share the same physical memory page. This enables direct inter-process communication, where one process can write data to the shared memory region and another process can read it. The implementation is simplified compared to POSIX mmap() and focuses specifically on shared memory between processes rather than file mapping.

  2. Page Table Structure: Describe the three-level page table structure used in xv6-RISC-V. What are the key components of a page table entry (PTE), and which permission bits are most relevant for shared memory?

    Solution: On RISC-V, xv6 uses a three-level page table that follows the Sv39 scheme:

    • Level 2: Top level, indexed by bits 30-38 of the virtual address
    • Level 1: Middle level, indexed by bits 21-29 of the virtual address
    • Level 0: Bottom level, indexed by bits 12-20 of the virtual address

    Each page table contains 512 (2^9) entries, with each PTE being 64 bits. Key components include:

    • Physical Page Number (PPN): Points to the next level page table or physical page
    • Permission flags: V (valid), R (read), W (write), X (execute), U (user-mode access)

    For shared memory, the most relevant permission bits are:

    • V: Must be set to indicate a valid mapping
    • R and W: Set to allow reading and writing to the shared memory
    • U: Set to allow user-mode access to the shared memory
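
The index split described above can be checked with a few lines of standalone C. This is a minimal sketch modelled on the PX macro in xv6's kernel/riscv.h; the lowercase helper name px and the constant names are chosen here for illustration and are not taken from the project code.

```c
#include <stdint.h>

#define PGSHIFT 12      /* 4096-byte pages: the low 12 bits are the page offset */
#define PXMASK  0x1FF   /* 9 bits per level, so 512 entries per table */

/* Return the 9-bit page-table index for a level (2 = top, 0 = bottom). */
static inline uint64_t px(int level, uint64_t va)
{
    return (va >> (PGSHIFT + 9 * level)) & PXMASK;
}
```

For the address 0x4000000 this gives index 0 at level 2, index 32 at level 1, and index 0 at level 0, since 0x4000000 is 2^26 and bit 26 falls in the level-1 field (bits 21-29).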

  3. System Call Interface: What are the two system calls implemented for shared memory management in xv6? Describe the parameters and return values for each.

    Solution:

    1. uint64 mmap(void):
      • Parameters: None in this simplified implementation
      • Return value: Returns the virtual address (SHMEM_REGION) of the mapped page on success, or 0 on failure
    2. int munmap(uint64 addr):
      • Parameters: addr - the virtual address of the shared memory to unmap
      • Return value: Returns 0 on success, -1 on failure (if the address is invalid or not mapped)

  4. Memory Region Layout: In the xv6 mmap implementation, where is the shared memory region located in the process address space? Why is this location chosen?

    Solution: The shared memory region in xv6 is located at virtual address 0x4000000 (the 64 MB mark), defined as SHMEM_REGION in vm.c. This location is chosen because it lies above the typical heap and stack usage in xv6, in a part of the address space that a process does not normally use, so the dedicated shared page does not conflict with normal memory allocation.

  5. Physical Memory Management: Explain how the system determines when to allocate or free the physical memory used for shared memory. What mechanism ensures memory is not freed prematurely?

    Solution: The system uses reference counting to determine when to allocate or free physical memory:

    • The first process to call mmap() causes a physical page to be allocated and the reference count set to 1
    • Subsequent processes mapping the same shared page increment the reference count
    • When a process unmaps the page (via munmap() or exit), the reference count is decremented
    • The physical memory is only freed when the reference count reaches zero, ensuring that memory isn’t freed while it’s still in use by other processes

    This reference counting mechanism ensures memory is not freed prematurely, as the physical page is only released when no processes are using it anymore.
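
The lifecycle above can be modelled in user space. The following is a rough sketch, not the project's kernel code: calloc() and free() stand in for kalloc() and kfree(), the names shared_page, shm_get, and shm_put are hypothetical, and locking is left out to keep the counting logic visible.

```c
#include <stdlib.h>

/* Stand-in for the kernel's record of the single shared page (illustrative). */
struct shared_page {
    void *pa;       /* backing memory; calloc() stands in for kalloc() */
    int refcount;   /* number of address spaces currently mapping the page */
};

static struct shared_page shm = { NULL, 0 };

/* Take a reference, allocating the page on first use. Returns NULL on failure. */
void *shm_get(void)
{
    if (shm.refcount == 0) {
        shm.pa = calloc(1, 4096);
        if (shm.pa == NULL)
            return NULL;
    }
    shm.refcount++;
    return shm.pa;
}

/* Drop a reference; free the page only when the last reference goes away. */
void shm_put(void)
{
    if (shm.refcount == 0)
        return;                 /* nothing is mapped, so there is nothing to drop */
    if (--shm.refcount == 0) {
        free(shm.pa);
        shm.pa = NULL;
    }
}
```

Two calls to shm_get() return the same pointer and leave the count at 2; the page survives the first shm_put() and is released by the second.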

  6. Inter-Process Communication: How does the shared memory implementation enable communication between a parent and child process? What happens during a fork() operation?

    Solution: The shared memory implementation enables communication between parent and child processes by mapping the same physical memory page into both processes’ address spaces. During fork():

    1. The uvmcopy() function detects when it’s copying the SHMEM_REGION address
    2. Instead of creating a new copy of the physical page (as it does for regular memory), it maps the same physical page into the child’s address space
    3. The reference count for the shared page is incremented to account for the new mapping
    4. Both parent and child now have access to the same physical memory, allowing one to write data that the other can immediately read
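
The difference between copying and sharing during fork() can be illustrated with a toy model. This sketch assumes a four-slot "address space" of heap-allocated pages in place of a real page table; toy_space, toy_fork_copy, SHARED_SLOT, and shared_refs are invented for the example and only mimic the behaviour described for the modified uvmcopy().

```c
#include <stdlib.h>
#include <string.h>

#define NPAGES      4       /* toy address space: four page slots */
#define PGSIZE      4096
#define SHARED_SLOT 3       /* slot standing in for SHMEM_REGION */

struct toy_space {
    char *page[NPAGES];     /* NULL means "not mapped" */
};

static int shared_refs = 0; /* stand-in for the shared page's reference count */

/* Copy parent into child: private pages are duplicated, the shared slot is
 * aliased and its reference count bumped. Returns 0 on success, -1 on failure
 * (a real implementation would also unwind the partial copy). */
int toy_fork_copy(const struct toy_space *parent, struct toy_space *child)
{
    for (int i = 0; i < NPAGES; i++) {
        child->page[i] = NULL;
        if (parent->page[i] == NULL)
            continue;
        if (i == SHARED_SLOT) {
            child->page[i] = parent->page[i];   /* same backing memory */
            shared_refs++;
        } else {
            child->page[i] = malloc(PGSIZE);
            if (child->page[i] == NULL)
                return -1;
            memcpy(child->page[i], parent->page[i], PGSIZE);
        }
    }
    return 0;
}
```

After the copy, a write through the child's shared slot is visible to the parent, while a write to one of the child's private pages is not.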

  7. System Structure: Name and briefly describe the key files that need to be modified to implement the mmap functionality in xv6.

    Solution: The key files that need to be modified are:

    1. kernel/defs.h: Add function prototypes for shared memory operations
    2. kernel/syscall.h: Add new syscall numbers for mmap and munmap
    3. kernel/syscall.c: Register the new syscalls in the syscall table
    4. kernel/vm.c: Implement shared page memory management functions including the core mmap/munmap logic and modifications to uvmcopy and uvmunmap
    5. kernel/sysproc.c: Implement syscall handlers to interface with the core functions
    6. kernel/main.c: Initialize the shared memory system
    7. user/user.h: Add user-side declarations for the syscalls
    8. user/usys.pl: Add syscall entries to generate user stubs
    9. Makefile: Add mmaptest to the build process

Moderate Problems

  1. Implementation Design: How does the shmem_page structure track shared memory pages? List its fields and explain the purpose of each.

    Solution: The shmem_page structure tracks shared memory pages with the following fields:

    • uint64 pa: Physical address of the shared page - stores the actual memory location
    • int refcount: Reference count - tracks how many processes are currently mapping this page
    • struct spinlock lock: Lock to protect access - ensures atomic operations on the structure
    • int allocated: Whether the page is allocated - indicates if the physical memory has been allocated

    This structure acts as a global registry for the single shared memory page, maintaining necessary information about its state and usage. The lock ensures that updates to the reference count and allocation status are atomic, preventing race conditions between processes.

  2. Reference Counting: Explain how reference counting is used in the mmap implementation. What happens when a process unmaps a shared memory region that is still in use by other processes?

    Solution: Reference counting in the mmap implementation works as follows:

    • When a process calls mmap(), the reference count is incremented
    • When a process calls fork() and the shared page is copied, the reference count is incremented
    • When a process unmaps the page via munmap() or exits, the reference count is decremented

    When a process unmaps a shared memory region that is still in use by other processes:

    1. The mapping is removed from the process’s page table
    2. The reference count is decremented
    3. The physical memory is NOT freed because the reference count is still > 0
    4. Other processes can continue to use the shared memory

    This ensures that the physical page remains allocated as long as at least one process is still using it.

  3. Error Handling: What are the possible failure points in the mmap() and munmap() functions? How does the implementation handle these potential failures?

    Solution: Possible failure points in mmap():

    • kalloc() might fail if there’s no free physical memory
    • mappages() might fail if there’s no memory for page table pages

    Error handling in mmap():

    • If kalloc() fails, it returns 0 (NULL) and mmap() returns 0 to indicate failure
    • If mappages() fails, it decrements the reference count, frees the page if the count has dropped to zero, and returns 0

    Possible failure points in munmap():

    • The address might not be SHMEM_REGION, the only address this implementation maps
    • The page at the address might not be mapped (PTE doesn’t exist or isn’t valid)

    Error handling in munmap():

    • If the address isn’t SHMEM_REGION, it returns -1
    • If the PTE is null or invalid, it returns -1
    • Otherwise, it removes the mapping, decrements the reference count, and returns 0
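
The two mmap() failure paths can be exercised in a user-space sketch with injected failures. Everything here is hypothetical scaffolding: toy_kalloc and toy_mappages stand in for kalloc() and mappages(), and the fail_alloc and fail_map switches exist only so the rollback can be tested.

```c
#include <stdlib.h>

/* Illustrative state for the single shared page. */
static void *page_pa = NULL;
static int   page_refs = 0;

/* Failure-injection switches for the sketch (not part of xv6). */
static int fail_alloc = 0;
static int fail_map   = 0;

static void *toy_kalloc(void)       { return fail_alloc ? NULL : malloc(4096); }
static int   toy_mappages(void *pa) { (void)pa; return fail_map ? -1 : 0; }

/* Returns the page on success, or NULL (0) on failure with state rolled back. */
void *toy_mmap(void)
{
    if (page_refs == 0) {
        page_pa = toy_kalloc();
        if (page_pa == NULL)
            return NULL;            /* failure point 1: out of memory */
    }
    page_refs++;

    if (toy_mappages(page_pa) != 0) {
        /* failure point 2: undo the reference that was just taken */
        if (--page_refs == 0) {
            free(page_pa);
            page_pa = NULL;
        }
        return NULL;
    }
    return page_pa;
}
```

The point of the rollback is that a failed call leaves the count and the page exactly as they were before the call, whether or not another mapping already exists.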

  4. Synchronization: Why are locks necessary in the shared memory implementation? Identify the critical sections where race conditions might occur without proper synchronization.

    Solution: Locks are necessary because multiple processes can simultaneously access and modify the shared memory metadata. Critical sections that require synchronization include:

    1. In mmap():
      • Checking if the page is already allocated and incrementing the reference count
      • Allocating a new page when none exists
      • Decrementing the reference count if mappages() fails
    2. In munmap():
      • Decrementing the reference count
      • Checking if the reference count is zero and freeing the page
    3. In uvmcopy() (fork):
      • Incrementing the reference count when mapping the shared page in the child
    4. In uvmunmap() (cleanup):
      • Decrementing the reference count
      • Checking if the reference count is zero and freeing the page

    Without locks, race conditions could lead to incorrect reference counts, premature freeing of memory, or memory leaks.
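
A sketch of how such a critical section is typically guarded follows. It is a user-space stand-in for a kernel spinlock built on the GCC/Clang __sync builtins, in the style of xv6's acquire() and release() but without the interrupt handling the kernel versions perform; toy_spinlock, shm_lock, and shm_ref_add are names assumed for the example.

```c
/* User-space stand-in for a kernel spinlock (illustrative). */
struct toy_spinlock {
    volatile unsigned int locked;   /* 0 = free, 1 = held */
};

static struct toy_spinlock shm_lock = { 0 };
static int shm_refcount = 0;

static void toy_acquire(struct toy_spinlock *lk)
{
    /* Atomically store 1 and fetch the old value; spin until the old value is 0. */
    while (__sync_lock_test_and_set(&lk->locked, 1) != 0)
        ;
    __sync_synchronize();   /* keep critical-section accesses after the acquire */
}

static void toy_release(struct toy_spinlock *lk)
{
    __sync_synchronize();   /* keep critical-section accesses before the release */
    __sync_lock_release(&lk->locked);
}

/* Critical section: the read-modify-write of the count happens under the lock. */
int shm_ref_add(int delta)
{
    toy_acquire(&shm_lock);
    shm_refcount += delta;
    int now = shm_refcount;
    toy_release(&shm_lock);
    return now;
}
```

Holding the lock across both the update and the read of the new value is what allows a caller to decide safely whether it was the one that brought the count to zero.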

  5. Page Table Modification: Explain how the implementation modifies the page tables to map the same physical page to multiple processes. Which function is used to add mappings to a page table?

    Solution: The implementation modifies page tables to map the same physical page to multiple processes by:

    1. In mmap():
      • It uses the mappages() function to create a mapping in the current process’s page table
      • This mapping points to the physical address stored in shmem_page.pa
      • The mapping has permissions PTE_R | PTE_W | PTE_U (read, write, user access)
    2. In uvmcopy() during fork():
      • It detects the SHMEM_REGION address in the parent’s page table
      • Instead of allocating a new page, it maps the same physical page (identified by pa)
      • It uses mappages() to create this mapping in the child’s page table with the same permissions

    The mappages() function is used to add these mappings to a page table. It creates page table entries (PTEs) that point to the same physical address but in different processes’ page tables, effectively sharing the physical memory.
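
To make the "same physical address, different page tables" point concrete, the sketch below builds the leaf PTE value that such a mapping would hold. The bit positions follow the Sv39 PTE layout, and the helpers mirror xv6's PA2PTE and PTE2PA macros; the lowercase function names and make_shared_pte are assumptions made for this example.

```c
#include <stdint.h>

#define PTE_V (1ULL << 0)   /* valid */
#define PTE_R (1ULL << 1)   /* readable */
#define PTE_W (1ULL << 2)   /* writable */
#define PTE_X (1ULL << 3)   /* executable */
#define PTE_U (1ULL << 4)   /* accessible from user mode */

/* Sv39 keeps the physical page number in PTE bits 10 and up. */
static inline uint64_t pa2pte(uint64_t pa)     { return (pa >> 12) << 10; }
static inline uint64_t pte2pa(uint64_t pte)    { return (pte >> 10) << 12; }
static inline uint64_t pte_flags(uint64_t pte) { return pte & 0x3FF; }

/* Build the leaf PTE that a shared read/write user mapping would install. */
static inline uint64_t make_shared_pte(uint64_t pa)
{
    return pa2pte(pa) | PTE_R | PTE_W | PTE_U | PTE_V;
}
```

Calling make_shared_pte() with the same physical address for the parent and for the child yields entries that decode to the same page, which is all that sharing amounts to at the page-table level.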

  6. Memory Cleanup: What modifications are made to the process cleanup routines to properly handle shared memory? How does the system ensure that shared pages are only freed when they are no longer used?

    Solution: The uvmunmap() function, which is called during process cleanup, is modified to properly handle shared memory:

    1. It checks if the address being unmapped corresponds to SHMEM_REGION
    2. If it’s the shared memory region, it:
      • Checks if the physical address matches shmem_page.pa
      • Acquires the lock to safely modify the reference count
      • Decrements the reference count
      • Only frees the physical page if the reference count reaches zero
      • Updates the shmem_page structure to mark the page as unallocated if freed

    This ensures that shared pages are only freed when they are no longer referenced by any process, preventing both memory leaks (where shared pages are never freed) and use-after-free bugs (where shared pages are freed while still in use).

  7. Test Program: Describe the steps the mmaptest program takes to verify that shared memory is working correctly. What specific functionality does it test?

    Solution: The mmaptest program verifies shared memory functionality through these steps:

    1. It calls mmap() to obtain a shared memory region
    2. It writes an integer value (42) to the shared memory
    3. It forks a child process
    4. The child process reads the value from shared memory to verify it can see the parent’s write (42)
    5. The child modifies the value in shared memory (to 100)
    6. The child calls munmap() to unmap its view of shared memory
    7. After the child exits, the parent reads the value to verify it can see the child’s modification (100)
    8. The parent calls munmap() to unmap its view of shared memory

    This tests several key aspects:

    • Basic mapping functionality (creating a shared mapping)
    • Data visibility between parent and child processes
    • The persistence of modifications across processes
    • Proper cleanup of shared memory through munmap()
    • Reference counting (ensuring memory remains valid after child unmaps but parent still uses it)
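
The xv6 test itself only runs inside xv6, but the same sequence of steps can be tried on a POSIX system. The sketch below is an analogue rather than the project's mmaptest: it uses the standard POSIX mmap() with MAP_SHARED | MAP_ANONYMOUS instead of the argument-less xv6 call, and the function name shared_roundtrip is made up for the example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE             /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the parent/child exchange and returns the value the parent finally
 * reads, or -1 if any step fails. */
int shared_roundtrip(void)
{
    int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;

    *shared = 42;                           /* parent writes */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof(int));
        return -1;
    }
    if (pid == 0) {                         /* child */
        int ok = (*shared == 42);           /* can it see the parent's write? */
        *shared = 100;                      /* child modifies the value */
        munmap(shared, sizeof(int));        /* child drops its view */
        _exit(ok ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        munmap(shared, sizeof(int));
        return -1;
    }

    int seen = *shared;                     /* parent's mapping is still valid */
    munmap(shared, sizeof(int));
    return seen;
}
```

The parent reading the child's value after the child has already unmapped and exited corresponds to the reference-counting check in the list above: the page must outlive the child's mapping.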

Hard Problems

  1. Implementation Analysis: The current implementation only supports a single shared memory page at a fixed address. Design an extension to the current implementation that would support multiple shared memory regions at dynamically determined addresses. What data structures would you need to modify or add?

    Solution: To support multiple shared memory regions at dynamically determined addresses, we would need:

    1. Replace the single shmem_page structure with a more flexible data structure:
      struct shmem_region {
        int id;                 // Identifier used to look the region up
        uint64 va;              // Virtual address where mapped
        uint64 pa;              // Physical address of the shared page
        int refcount;           // Reference count
        uint64 size;            // Size of region in bytes (could be multiple pages)
        struct shmem_region *next;  // Link to next region or NULL
      };

      struct {
        struct spinlock lock;            // Lock for the whole system
        struct shmem_region *regions;    // Linked list of regions
      } shmem_system;
      
    2. Modify the mmap() API to support dynamic address allocation:
      uint64 mmap(uint64 len, int prot, int flags, int id);
      
      • len: Length of the region to map
      • prot: Protection flags (read/write/execute)
      • flags: Flags to control behavior
      • id: An identifier for the shared region (0 for anonymous, positive for named regions)
    3. Add functions to:
      • Find a free virtual address range of appropriate size
      • Allocate and track multiple physical pages if needed
      • Manage the linked list of shared regions
    4. Update uvmcopy() and uvmunmap() to work with the list structure:
      • Traverse the list to find regions that need special handling
      • Apply reference counting to each region individually
    5. Implement a region lookup function to find a region by id or address

    This design would allow multiple independent shared regions while maintaining proper reference counting for each.
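
A possible shape for the lookup and address-selection helpers is sketched below in user-space C. It is one way to realize the design, not a reference solution: malloc() stands in for kernel allocation, physical pages and locking are omitted, identifiers are assumed to be positive, and shm_region, region_find, region_find_free_va, and region_get are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

#define PGSIZE 4096

struct shm_region {
    int id;                     /* caller-chosen identifier (assumed > 0) */
    uint64_t va;                /* start of the mapping */
    uint64_t size;              /* length in bytes, rounded up to whole pages */
    int refcount;
    struct shm_region *next;
};

/* Find a region by identifier; NULL if absent. */
struct shm_region *region_find(struct shm_region *head, int id)
{
    for (struct shm_region *r = head; r != NULL; r = r->next)
        if (r->id == id)
            return r;
    return NULL;
}

/* Lowest address >= base where [va, va + size) overlaps no existing region.
 * The scan restarts whenever a collision pushes the candidate upward. */
uint64_t region_find_free_va(struct shm_region *head, uint64_t base, uint64_t size)
{
    uint64_t va = base;
    struct shm_region *r = head;
    while (r != NULL) {
        if (va < r->va + r->size && r->va < va + size) {
            va = r->va + r->size;   /* skip past the colliding region */
            r = head;               /* re-check against every region */
        } else {
            r = r->next;
        }
    }
    return va;
}

/* Return the existing region for id (bumping its refcount) or create one. */
struct shm_region *region_get(struct shm_region **head, int id,
                              uint64_t base, uint64_t size)
{
    struct shm_region *r = region_find(*head, id);
    if (r != NULL) {
        r->refcount++;
        return r;
    }
    r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->id = id;
    r->size = (size + PGSIZE - 1) / PGSIZE * PGSIZE;    /* round up to pages */
    r->va = region_find_free_va(*head, base, r->size);
    r->refcount = 1;
    r->next = *head;
    *head = r;
    return r;
}
```

With a page-aligned base, every address returned is page-aligned because region sizes are always whole pages. A linked list keeps the sketch short; a fixed-size table would avoid dynamic allocation inside the kernel at the cost of a hard limit on the number of regions.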