[PATCH 0/4] arm64: mm: support dynamic vmalloc/pmd configuration
Christophe Leroy
christophe.leroy at csgroup.eu
Tue Feb 20 23:32:09 PST 2024
On 20/02/2024 at 21:32, Maxwell Bland wrote:
> Reworks ARM's virtual memory allocation infrastructure to support
> dynamic enforcement of page middle directory PXNTable restrictions
> rather than only during the initial memory mapping. Runtime enforcement
> of this bit prevents write-then-execute attacks, where malicious code is
> staged in vmalloc'd data regions, and later the page table is changed to
> make this code executable.
>
> Previously the entire region from VMALLOC_START to VMALLOC_END was
> vulnerable, but now the vulnerable region is restricted to the 2GB
> reserved by module_alloc, a region which is generally read-only and more
> difficult to inject staging code into, e.g., data must pass the BPF
> verifier. These changes also set the stage for other systems, such as
> KVM-level (EL2) changes to mark page tables immutable and code page
> verification changes, forging a path toward complete mitigation of
> kernel exploits on ARM.
>
> Implementing this required minimal changes to the generic vmalloc
> interface in the kernel to allow architecture overrides of some vmalloc
> wrapper functions, refactoring vmalloc calls to use a standard interface
> in the generic kernel, and passing the address parameter already passed
> into PTE allocation to the pte_allocate child function call.
>
> The new arm64 vmalloc wrapper functions ensure vmalloc data is not
> allocated into the region reserved for module_alloc. arm64 BPF and
> kprobe code also see a two-line-change ensuring their allocations abide
> by the segmentation of code from data. Finally, arm64's pmd_populate
> function is modified to set the PXNTable bit appropriately.
On powerpc (book3s/32) we have more or less the same, although it is not
directly linked to PMDs: the 4G virtual address space is split into
segments of 256M. Each segment has a bit called NX to forbid
execution. Vmalloc space is allocated in a segment with the NX bit set, while
module space is allocated in a segment with the NX bit unset. We never have
to override vmalloc wrappers. All consumers of exec memory allocate it
using module_alloc(), while vmalloc() provides non-exec memory.
For modules, all you have to do is select
ARCH_WANTS_MODULES_DATA_IN_VMALLOC, and module data will be allocated
using vmalloc(), hence in non-exec memory in our case.
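To illustrate the segment scheme described above, here is a rough user-space model in C. It is only a sketch, not kernel code: the names (segment_of, set_segment_nx, addr_is_executable, example_layout) and the choice of segment 0xB for modules are hypothetical, and for brevity only the module segment is left executable, whereas a real kernel would also keep kernel text executable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only; all names and the layout below are hypothetical. */
#define SEGMENT_SHIFT 28u   /* 256M segments: 1 << 28 bytes each */
#define NUM_SEGMENTS  16u   /* 4G address space / 256M per segment */

/* One bit per segment: 1 means NX is set (execution forbidden). */
typedef uint16_t nx_mask_t;

/* Index of the 256M segment that contains a 32-bit virtual address. */
static inline unsigned int segment_of(uint32_t addr)
{
    return addr >> SEGMENT_SHIFT;
}

/* Return a copy of mask with the NX bit of one segment set or cleared. */
static inline nx_mask_t set_segment_nx(nx_mask_t mask, unsigned int seg, bool nx)
{
    assert(seg < NUM_SEGMENTS);
    if (nx)
        return (nx_mask_t)(mask | (1u << seg));
    return (nx_mask_t)(mask & ~(1u << seg));
}

/* An address is executable only if its segment has NX cleared. */
static inline bool addr_is_executable(nx_mask_t mask, uint32_t addr)
{
    return !((mask >> segment_of(addr)) & 1u);
}

/* Hypothetical layout: every segment is NX except 0xB, used for modules. */
#define EXAMPLE_MODULES_SEG 0xBu

static inline nx_mask_t example_layout(void)
{
    nx_mask_t mask = 0xFFFFu;   /* start with NX set everywhere */
    return set_segment_nx(mask, EXAMPLE_MODULES_SEG, false);
}
```

In this model an address in the module segment (say 0xB0001000) is executable, while one in any other segment, such as a vmalloc address at 0xC0001000, is not. The allocator never has to look at permissions: picking the segment picks the permission.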
Christophe
More information about the linux-riscv
mailing list