[PATCH 1/8] mm: Add ptep_try_install() for lockless empty-slot installs
David Hildenbrand (Arm)
david at kernel.org
Wed May 20 14:29:07 PDT 2026

On 5/20/26 20:38, Tejun Heo wrote:
> On Wed, May 20, 2026 at 10:41:15AM +0200, David Hildenbrand (Arm) wrote:
>> And that should be carefully documented.
>
> Will do.
>
>>> The fault handler has to install _some_ page and let kernel continue.
>>> Scratch page or arena page doesn't matter. Potentially different CPUs
>>> will see different page. It's not a concern at all.
>>> bpf prog is buggy, but the kernel will continue to work without a glitch.
>>> bpf runtime will disable and unload misbehaving prog.
>>
>> Having one page table walker overwrite a scratch page on race is just rather ...
>> questionable locking design, that just screams for problems long-term, really.
>>
>> At least in apply_range_clear_cb() one could similarly switch to
>> ptep_try_install() to at least have both these paths handle races in a
>> reasonable way. (having to handle when ptep_try_install() is not really implemented)
>
> Hmm... wouldn't that be more confusing on apply_range_clear_cb() side?
> Whether it maintains the current behavior (if collide with scratch, try
> again with scratch as the original value) or flip the behavior (fail if
> scratch), that extra logic is spurious, and those tend to confuse people as
> they force asking why it has to be that specific way. If the goal is
> documenting the subtlety, wouldn't a detailed comment serve the purpose
> better?

I would let all operations that can happen concurrently work with atomics, not
just a single one.

Alexei correctly concluded that ptep_get_and_clear() might be one option; from a
quick glimpse, it should do the job on x86 and arm64.

>
>> Anyhow, the documentation of ptep_try_install() must clearly spell out that this
>> must be used very carefully, and only in special kernel page tables, never user
>> page tables. There are likely other scenarios we should document (caller must
>> prevent concurrent page table teardown somehow, and must be prepared to handle
>> races if other code is not using atomics).
>>
>> To highlight that, we should likely consider adding a "kernel" in the name, like
>> "ptep_try_install_kernel()".
>>
>> I am also not sure if "install" is the right terminology and whether it should
>> instead be "ptep_try_set()". (set_pte_at is the non-atomic interface right now)
>
> So, ptep_try_set_kernel()?

I guess we could do that, but we don't have a lot of existing functions that are
specific to kernel page tables ... only pte_offset_kernel() and
pte_alloc_one_kernel()/pte_free_kernel().

The most important part is getting the documentation right.

We should probably document that code concurrently clearing PTEs should be using
ptep_get_and_clear().
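
To illustrate the pairing discussed above, here is a minimal userspace sketch
in C11 atomics, not kernel code. It assumes the install side behaves like a
compare-and-exchange against an empty slot and the clear side like an atomic
exchange, which is roughly what ptep_get_and_clear() amounts to on x86 and
arm64. The names slot_t, SLOT_NONE, slot_try_install() and
slot_get_and_clear() are hypothetical stand-ins and do not reflect the
signatures in the actual patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PTE slot; 0 plays the role of an empty PTE. */
typedef _Atomic uint64_t slot_t;
#define SLOT_NONE ((uint64_t)0)

/*
 * Install newval only if the slot is still empty.
 * Returns true if this caller won the race, false if the slot was
 * already populated (the existing value is left untouched).
 */
static bool slot_try_install(slot_t *slot, uint64_t newval)
{
	uint64_t expected = SLOT_NONE;

	assert(newval != SLOT_NONE);
	return atomic_compare_exchange_strong(slot, &expected, newval);
}

/*
 * Atomically empty the slot and return whatever was in it, so a
 * concurrent installer either lands before the clear (and its value is
 * returned here) or after it (and finds the slot empty).
 */
static uint64_t slot_get_and_clear(slot_t *slot)
{
	return atomic_exchange(slot, SLOT_NONE);
}
```

With both sides atomic, a clearing path never silently overwrites a value it
did not observe: it always gets back the exact entry it removed and can decide
what to do with it.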
--
Cheers,
David