Size Does Matter: Utilizing Base and Huge Pages Simultaneously

Figure: Two morsels mapping the same pages in different sizes

Context

Modern operating systems use virtual memory and paging to isolate processes and devices from each other. A virtual address space is defined by a tree of page tables. Translating a virtual address into a physical address therefore requires multiple memory accesses, one per level of the tree, which is expensive. To avoid this cost, the CPU caches known translations in the translation lookaside buffer (TLB). The number of translations the TLB can hold is limited, however, and so is the TLB coverage, the amount of memory reachable through cached translations. This limit has a big impact on application performance.
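As an illustration of why a translation needs several memory accesses, the following sketch splits a virtual address into the table indices used at each level of the walk. It assumes x86_64 with 4-level paging (9 index bits per level, 12 offset bits for a 4KiB page); the function name `split_virtual_address` and its `leaf_level` parameter are made up for this example.

```python
PAGE_SHIFT = 12      # 4 KiB base pages: 12 offset bits
BITS_PER_LEVEL = 9   # 512 entries per page table
LEVELS = 4           # assumption: 4-level paging (48-bit virtual addresses)


def split_virtual_address(vaddr: int, leaf_level: int = 1):
    """Return (indices, offset) for a page-table walk ending at leaf_level.

    leaf_level=1 ends at a 4 KiB page; leaf_level=2 stops one level
    earlier, so the remaining low bits form the offset into a 2 MiB page.
    """
    offset_bits = PAGE_SHIFT + BITS_PER_LEVEL * (leaf_level - 1)
    offset = vaddr & ((1 << offset_bits) - 1)
    indices = []
    # Walk from the top-level table down to the leaf level.
    for level in range(LEVELS, leaf_level - 1, -1):
        shift = PAGE_SHIFT + BITS_PER_LEVEL * (level - 1)
        indices.append((vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1))
    return indices, offset
```

Each entry in `indices` corresponds to one table lookup, so a base-page walk touches four tables in this model and a walk that ends one level earlier touches three.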

To improve the TLB coverage, one can increase the page size and use Huge Pages (HPs). On most architectures they are implemented by skipping the last level of the address translation, which results in an HP size of 2MiB on Intel x86_64. While HPs can improve TLB coverage and thus application performance, they also come with drawbacks. One such drawback becomes clear when HPs are combined with Copy-On-Write (COW): when a write fault is triggered, 2MiB have to be copied instead of 4KiB. Depending on the workload, COW can therefore benefit from using Base Pages instead of Huge Pages.
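The trade-off can be made concrete with a little arithmetic. The sketch below treats TLB coverage as the number of entries times the page size and compares the bytes copied per COW fault. The entry count of 1024 is a hypothetical value chosen for illustration, not a figure for any particular CPU, and the helper names are invented here.

```python
KIB = 1024
MIB = 1024 * KIB

BASE_PAGE = 4 * KIB
HUGE_PAGE = 2 * MIB

TLB_ENTRIES = 1024  # hypothetical entry count, for illustration only


def tlb_coverage(entries: int, page_size: int) -> int:
    """Bytes of memory reachable through cached translations."""
    return entries * page_size


def cow_copy_bytes(write_faults: int, page_size: int) -> int:
    """Bytes copied if every write fault hits a page not copied before."""
    return write_faults * page_size


if __name__ == "__main__":
    for name, size in (("base", BASE_PAGE), ("huge", HUGE_PAGE)):
        print(name,
              "coverage:", tlb_coverage(TLB_ENTRIES, size) // MIB, "MiB,",
              "copy per fault:", cow_copy_bytes(1, size) // KIB, "KiB")
```

Both quantities scale with the same factor of 512 (2MiB divided by 4KiB), which is why a larger page size helps the TLB and hurts sparse COW workloads at the same time.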

Problem

Morsels are self-contained virtual-memory objects. The main property that sets them apart is that the page tables are part of the object. This makes it possible to efficiently share and unshare memory between processes and devices. Secondary morsels extend the morsel concept with support for using multiple incompatible architectures simultaneously. They currently require the incompatible architectures to use pages of the same size. In addition, COW cannot be used directly with a secondary morsel; copy requests are redirected to the parent morsel instead.

Goal

The goal of this thesis is to extend secondary morsels with support for different page sizes and to adapt the semantics of COW to work directly with secondaries. As a first step, you will implement the concept for x86 Base and Huge Pages. Once this implementation is complete, the benefits of mixing different page sizes have to be evaluated. Redis with its persistence feature is one possible evaluation target.
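To picture what mapping the same memory in two page sizes means, here is a toy user-space model, not the morsel implementation. A parent view is a flat list of 2MiB-aligned physical frame addresses, and a secondary view is derived from it with one 4KiB entry per base page. All names (`derive_base_view`, `translate`) and the frame addresses are hypothetical.

```python
BASE_PAGE = 4096
HUGE_PAGE = 2 * 1024 * 1024
PAGES_PER_HUGE = HUGE_PAGE // BASE_PAGE  # 512 base pages per huge page


def derive_base_view(huge_frames):
    """Expand each 2 MiB frame into the 512 4 KiB frames it consists of."""
    base_frames = []
    for frame in huge_frames:
        if frame % HUGE_PAGE != 0:
            raise ValueError("huge frame must be 2 MiB aligned")
        base_frames.extend(frame + i * BASE_PAGE
                           for i in range(PAGES_PER_HUGE))
    return base_frames


def translate(frames, page_size, offset):
    """Translate an offset within the object to a physical address."""
    return frames[offset // page_size] + offset % page_size


# Hypothetical physical frames backing a 4 MiB object.
parent = [0x4000_0000, 0x1020_0000]
secondary = derive_base_view(parent)
```

In this model, `translate(parent, HUGE_PAGE, off)` and `translate(secondary, BASE_PAGE, off)` return the same physical address for every offset, while the secondary needs 512 times as many entries as the parent.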

Schedule

Your thesis will follow these key steps:

  1. Getting started: Familiarize yourself with kernel development, set up a suitable development environment, and establish a functional test setup.
  2. Basic implementation: Implement a basic version in which the parent uses 2MiB pages and the secondary uses 4KiB pages. With the basic implementation, it should be possible to create, map, unmap and destroy the parent and its secondary.
  3. Full feature support: Extend the basic implementation to support population, eviction and COW.
  4. Extending the semantics: Extend the COW semantics of secondary morsels to support COW of a secondary directly instead of redirecting it to the parent. To simplify the scenario, the parent can be set to read-only in this case.
  5. Evaluation: Evaluate the simultaneous usage of different page sizes using synthetic workloads. If there is still time, include real-world applications (e.g. Redis with persistence).
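For the evaluation step, a synthetic workload could start from a model as simple as the following sketch. It counts how many bytes COW would copy for a set of write offsets at a given page size, under the assumption that each page is copied once on its first write. The function names and the seeded random workload are illustrative choices, not part of the thesis description.

```python
import random

BASE_PAGE = 4096
HUGE_PAGE = 2 * 1024 * 1024


def cow_bytes_copied(write_offsets, page_size):
    """Bytes copied when each page is copied once on its first write."""
    touched_pages = {offset // page_size for offset in write_offsets}
    return len(touched_pages) * page_size


def random_writes(count, region_size, seed=0):
    """Uniformly random write offsets; seeded so runs are repeatable."""
    rng = random.Random(seed)
    return [rng.randrange(region_size) for _ in range(count)]


if __name__ == "__main__":
    writes = random_writes(count=100, region_size=64 * HUGE_PAGE)
    print("base pages:", cow_bytes_copied(writes, BASE_PAGE), "bytes")
    print("huge pages:", cow_bytes_copied(writes, HUGE_PAGE), "bytes")
```

Sparse random writes are the unfavourable case for huge pages in this model; a sequential workload that eventually touches every byte would copy the same amount at either size.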

References

Huge Pages: Most architectures support more than one page size at a time, most commonly by skipping the last layer of page tables.

Papers

DIMES Workshop
Morsels: Explicit Virtual Memory Objects
Alexander Halbuer, Christian Dietrich, Florian Rommel, Daniel Lohmann. Proceedings of the 1st Workshop on Disruptive Memory Systems. Association for Computing Machinery, 2023.
DOI: 10.1145/3609308.3625267

Multi-Target Virtual-Memory Objects

Paging has established itself as the go-to solution for memory virtualization, but actual implementations differ. Multiple synchronized views could fill the gap and allow direct sharing between different domains.

 
Type
Master's thesis

 
Status
completed

 
Supervisors
Alexander Halbuer
Daniel Lohmann

 
Project
ParPerOS

 
Student
Nils Fuhler (submitted: 12 Sep 2025)