[PATCH 0/4] arm64: mm: support dynamic vmalloc/pmd configuration

Christophe Leroy christophe.leroy at csgroup.eu
Tue Feb 20 23:32:09 PST 2024



On 20/02/2024 at 21:32, Maxwell Bland wrote:
> Reworks ARM's virtual memory allocation infrastructure to support
> dynamic enforcement of page middle directory PXNTable restrictions
> rather than only during the initial memory mapping. Runtime enforcement
> of this bit prevents write-then-execute attacks, where malicious code is
> staged in vmalloc'd data regions, and later the page table is changed to
> make this code executable.
> 
> Previously the entire region from VMALLOC_START to VMALLOC_END was
> vulnerable, but now the vulnerable region is restricted to the 2GB
> reserved by module_alloc, a region which is generally read-only and more
> difficult to inject staging code into, e.g., data must pass the BPF
> verifier. These changes also set the stage for other systems, such as
> KVM-level (EL2) changes to mark page tables immutable and code page
> verification changes, forging a path toward complete mitigation of
> kernel exploits on ARM.
> 
> Implementing this required minimal changes to the generic vmalloc
> interface in the kernel to allow architecture overrides of some vmalloc
> wrapper functions, refactoring vmalloc calls to use a standard interface
> in the generic kernel, and passing the address parameter already passed
> into PTE allocation to the pte_allocate child function call.
> 
> The new arm64 vmalloc wrapper functions ensure vmalloc data is not
> allocated into the region reserved for module_alloc. arm64 BPF and
> kprobe code also see a two-line-change ensuring their allocations abide
> by the segmentation of code from data. Finally, arm64's pmd_populate
> function is modified to set the PXNTable bit appropriately.

On powerpc (book3s/32) we have more or less the same, although it is not
directly linked to PMDs: the 4G virtual address space is split into
segments of 256M. Each segment has a bit called NX to forbid
execution. Vmalloc space is allocated in a segment with the NX bit set, while
module space is allocated in a segment with the NX bit unset. We never have
to override vmalloc wrappers. All consumers of exec memory allocate it
using module_alloc(), while vmalloc() provides non-exec memory.
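
To make the segment arithmetic concrete, here is a small userspace sketch,
not kernel code. It models the 16 segments of 256M as a bitmask with one NX
bit per segment. The helper names and the choice of which segment is
executable are assumptions made for illustration only; the real layout
depends on the kernel configuration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEGMENT_SHIFT 28u   /* 256M == 1 << 28 */
#define NUM_SEGMENTS  16u   /* 4G / 256M */

/* Index of the 256M segment that contains a 32-bit virtual address. */
static unsigned int segment_of(uint32_t addr)
{
	return addr >> SEGMENT_SHIFT;
}

/*
 * Hypothetical helper: build a mask with NX set on every segment except
 * exec_segment (must be < NUM_SEGMENTS). Bit n set means segment n is
 * non-executable.
 */
static uint16_t nx_mask_with_exec_segment(unsigned int exec_segment)
{
	return (uint16_t)(0xffffu & ~(1u << exec_segment));
}

/* An address is executable only if its segment has NX clear. */
static bool addr_is_executable(uint16_t nx_mask, uint32_t addr)
{
	return !(nx_mask & (1u << segment_of(addr)));
}
```

With a mask built for one executable segment, any address in that segment
passes addr_is_executable() and any address in another segment fails it,
whatever the page tables say. That is why no vmalloc wrapper has to be
overridden in this model.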

For modules, all you have to do is select 
ARCH_WANTS_MODULES_DATA_IN_VMALLOC and module data will be allocated 
using vmalloc() hence non-exec memory in our case.
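
As a rough illustration of the effect of that option, the following
userspace sketch models the choice of allocator per module section. The
enum and function names are hypothetical and do not mirror the kernel's
module loader; only the decision described above is modelled.

```c
#include <stdbool.h>

/* Hypothetical section kinds, for illustration only. */
enum section_kind { SECTION_TEXT, SECTION_DATA };

/* Which allocator would back a given section in this model. */
enum allocator { ALLOC_MODULE_ALLOC, ALLOC_VMALLOC };

/*
 * data_in_vmalloc stands in for ARCH_WANTS_MODULES_DATA_IN_VMALLOC being
 * selected. Text always comes from the executable module area; data moves
 * to vmalloc space (non-exec in the layout above) when the option is set.
 */
static enum allocator pick_allocator(enum section_kind kind,
				     bool data_in_vmalloc)
{
	if (data_in_vmalloc && kind == SECTION_DATA)
		return ALLOC_VMALLOC;
	return ALLOC_MODULE_ALLOC;
}
```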

Christophe

