[RFC PATCH v5 00/18] pkeys-based page table hardening
Yang Shi
yang at os.amperecomputing.com
Tue Aug 26 12:18:35 PDT 2025
On 8/25/25 12:31 AM, Kevin Brodsky wrote:
> On 21/08/2025 19:29, Yang Shi wrote:
>> Hi Kevin,
>>
>> On 8/15/25 1:54 AM, Kevin Brodsky wrote:
>>> This is a proposal to leverage protection keys (pkeys) to harden
>>> critical kernel data, by making it mostly read-only. The series includes
>>> a simple framework called "kpkeys" to manipulate pkeys for in-kernel use,
>>> as well as a page table hardening feature based on that framework,
>>> "kpkeys_hardened_pgtables". Both are implemented on arm64 as a proof of
>>> concept, but they are designed to be compatible with any architecture
>>> that supports pkeys.
>> [...]
>>
>>> Note: the performance impact of set_memory_pkey() is likely to be
>>> relatively low on arm64 because the linear mapping uses PTE-level
>>> descriptors only. This means that set_memory_pkey() simply changes the
>>> attributes of some PTE descriptors. However, some systems may be able to
>>> use higher-level descriptors in the future [5], meaning that
>>> set_memory_pkey() may have to split mappings. Allocating page tables
>> I suppose the page table hardening feature will be opt-in due to
>> its overhead? If so I think you can just keep kernel linear mapping
>> using PTE, just like debug page alloc.
> Indeed, I don't expect it to be turned on by default (in defconfig). If
> the overhead proves too large when block mappings are used, it seems
> reasonable to force PTE mappings when kpkeys_hardened_pgtables is enabled.
>
>>> from a contiguous cache of pages could help minimise the overhead, as
>>> proposed for x86 in [1].
>> I'm a little bit confused about how this can work. The contiguous
>> cache of pages should be some large page, for example, 2M. But the
>> page table pages allocated from the cache may have different
>> permissions if I understand correctly. The default permission is RO,
>> but some of them may become R/W at some point, for example, when calling
>> set_pte_at(). You still need to split the linear mapping, right?
> When such a helper is called, *all* PTPs become writeable - there is no
> per-PTP permission switching.

OK, so all PTPs in the same contiguous cache will become writeable even
though the helper (e.g. set_pte_at()) is only called on one of the
PTPs. But doesn't that compromise the page table hardening to some
extent? The PTPs from the same cache may belong to different processes.

Thanks,
Yang

>
> PTPs remain mapped RW (i.e. the base permissions set at the PTE level
> are RW). With this series, they are also all mapped with the same pkey
> (1). By default, the pkey register is configured so that pkey 1 provides
> RO access. The net result is that PTPs are RO by default, since the pkey
> restricts the effective permissions.
>
> When calling e.g. set_pte(), the pkey register is modified to enable RW
> access to pkey 1, making it possible to write to any PTP. Its value is
> restored when the function exits so that PTPs are once again RO.
>
> - Kevin
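
The scheme described above (RW base permissions, a shared pkey 1 for
all PTPs, and a pkey register that grants RO by default and RW only
inside helpers such as set_pte()) can be modelled in a few lines of
plain C. This is only a user-space sketch under stated assumptions:
every type and function name below (struct pkey_reg, struct
page_mapping, effective_perm(), try_write(), default_reg(),
model_set_pte()) is hypothetical and is not the API of the series, and
the "register" is an ordinary array rather than a hardware register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; names and encodings are illustrative only. */
enum perm { PERM_NONE = 0, PERM_R = 1, PERM_W = 2, PERM_RW = PERM_R | PERM_W };

#define NR_PKEYS      8
#define PKEY_PGTABLES 1 /* per the thread, all PTPs are mapped with pkey 1 */

/* Stand-in for the pkey register: one permission slot per pkey. */
struct pkey_reg {
	uint8_t perm[NR_PKEYS];
};

/* A mapped page: base permissions from its PTE, plus the pkey it carries. */
struct page_mapping {
	uint8_t base_perm;
	uint8_t pkey;
	uint64_t data;
};

/* The pkey can only restrict the base permissions, never extend them. */
static uint8_t effective_perm(const struct pkey_reg *reg,
			      const struct page_mapping *m)
{
	return m->base_perm & reg->perm[m->pkey];
}

/* Returns false where real hardware would raise a permission fault. */
static bool try_write(const struct pkey_reg *reg, struct page_mapping *m,
		      uint64_t val)
{
	if (!(effective_perm(reg, m) & PERM_W))
		return false;
	m->data = val;
	return true;
}

/* Default register state: pkey 1 is read-only, every other pkey is RW. */
static struct pkey_reg default_reg(void)
{
	struct pkey_reg reg;

	for (int i = 0; i < NR_PKEYS; i++)
		reg.perm[i] = PERM_RW;
	reg.perm[PKEY_PGTABLES] = PERM_R;
	return reg;
}

/* Model of a set_pte()-style helper: open pkey 1, write, then restore. */
static bool model_set_pte(struct pkey_reg *reg, struct page_mapping *ptp,
			  uint64_t val)
{
	uint8_t saved = reg->perm[PKEY_PGTABLES];
	bool ok;

	assert(ptp->pkey == PKEY_PGTABLES);
	reg->perm[PKEY_PGTABLES] = PERM_RW; /* every PTP is writeable now */
	ok = try_write(reg, ptp, val);
	reg->perm[PKEY_PGTABLES] = saved;   /* PTPs are RO again */
	return ok;
}
```

In this model, a direct try_write() to a PTP fails with the default
register value, while model_set_pte() succeeds and leaves the register
as it found it. It also illustrates the point raised in the reply:
while the register grants RW to pkey 1, the check in try_write() passes
for any page tagged with that pkey, not only the one being updated.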