[PATCH 2/2] arm64/mm: Improve comment in contpte_ptep_get_lockless()
David Hildenbrand
david at redhat.com
Mon Feb 26 04:40:54 PST 2024
On 26.02.24 13:37, Ryan Roberts wrote:
> On 26/02/2024 12:30, David Hildenbrand wrote:
>> On 26.02.24 13:03, Ryan Roberts wrote:
>>> Make clear the atomicity/consistency requirements of the API and how we
>>> achieve them.
>>>
>>> Link: https://lore.kernel.org/linux-mm/Zc-Tqqfksho3BHmU@arm.com/
>>> Signed-off-by: Ryan Roberts <ryan.roberts at arm.com>
>>> ---
>>> arch/arm64/mm/contpte.c | 24 ++++++++++++++----------
>>> 1 file changed, 14 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
>>> index be0a226c4ff9..1b64b4c3f8bf 100644
>>> --- a/arch/arm64/mm/contpte.c
>>> +++ b/arch/arm64/mm/contpte.c
>>> @@ -183,16 +183,20 @@ EXPORT_SYMBOL_GPL(contpte_ptep_get);
>>> pte_t contpte_ptep_get_lockless(pte_t *orig_ptep)
>>> {
>>> /*
>>> - * Gather access/dirty bits, which may be populated in any of the ptes
>>> - * of the contig range. We may not be holding the PTL, so any contiguous
>>> - * range may be unfolded/modified/refolded under our feet. Therefore we
>>> - * ensure we read a _consistent_ contpte range by checking that all ptes
>>> - * in the range are valid and have CONT_PTE set, that all pfns are
>>> - * contiguous and that all pgprots are the same (ignoring access/dirty).
>>> - * If we find a pte that is not consistent, then we must be racing with
>>> - * an update so start again. If the target pte does not have CONT_PTE
>>> - * set then that is considered consistent on its own because it is not
>>> - * part of a contpte range.
>>> + * The ptep_get_lockless() API requires us to read and return *orig_ptep
>>> + * so that it is self-consistent, without the PTL held, so we may be
>>> + * racing with other threads modifying the pte. Usually a READ_ONCE()
>>> + * would suffice, but for the contpte case, we also need to gather the
>>> + * access and dirty bits from across all ptes in the contiguous block,
>>> + * and we can't read all of those neighbouring ptes atomically, so any
>>> + * contiguous range may be unfolded/modified/refolded under our feet.
>>> + * Therefore we ensure we read a _consistent_ contpte range by checking
>>> + * that all ptes in the range are valid and have CONT_PTE set, that all
>>> + * pfns are contiguous and that all pgprots are the same (ignoring
>>> + * access/dirty). If we find a pte that is not consistent, then we must
>>> + * be racing with an update so start again. If the target pte does not
>>> + * have CONT_PTE set then that is considered consistent on its own
>>> + * because it is not part of a contpte range.
>>> */
>>> pgprot_t orig_prot;
>>
>> Reviewed-by: David Hildenbrand <david at redhat.com>
>
> Thanks!
>
>>
>> In an ideal world, we'd really not rely on any accessed/dirty on the lockless
>> path and remove contpte_ptep_get_lockless() completely :)
>
> Not sure if you saw my RFC to do exactly that? (well, it doesn't actually remove
> [contpte_]ptep_get_lockless() but it does remove all the callers). If you have
> any feedback, we could get this moving...
Yes, I saw it. Hoping we can get that in and then maybe remove the
contpte_ptep_get_lockless() once all callers are gone :)
... on my todo list.
--
Cheers,
David / dhildenb
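
For readers following the thread, the retry scheme described in the quoted
comment can be sketched in plain userspace C. This is only an illustration
under stated assumptions, not the code in arch/arm64/mm/contpte.c: the bit
positions, the 16-entry block size, and the names pte_slot_t, make_pte() and
sketch_get_lockless() are all hypothetical, and a relaxed atomic load stands
in for READ_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pte layout, chosen for illustration only. */
#define CONT_PTES  16
#define PTE_VALID  (1ULL << 0)
#define PTE_AF     (1ULL << 10)   /* access flag */
#define PTE_CONT   (1ULL << 52)   /* entry belongs to a contiguous block */
#define PTE_DIRTY  (1ULL << 55)
#define PFN_SHIFT  12
#define PFN_MASK   (((1ULL << 36) - 1) << PFN_SHIFT)

typedef _Atomic uint64_t pte_slot_t;

static uint64_t make_pte(uint64_t pfn, uint64_t flags)
{
	return (pfn << PFN_SHIFT) | flags;
}

static uint64_t pte_pfn(uint64_t pte)
{
	return (pte & PFN_MASK) >> PFN_SHIFT;
}

/* Everything except the pfn and the access/dirty bits. */
static uint64_t pte_prot(uint64_t pte)
{
	return pte & ~(PFN_MASK | PTE_AF | PTE_DIRTY);
}

/*
 * Read table[idx] without a lock. If the entry is part of a contiguous
 * block, also gather access/dirty from every entry in the block, and
 * start again whenever the block does not look self-consistent.
 * The table is assumed to start on a CONT_PTES-aligned index.
 */
static uint64_t sketch_get_lockless(pte_slot_t *table, size_t idx)
{
	for (;;) {
		uint64_t orig = atomic_load_explicit(&table[idx],
						     memory_order_relaxed);

		/* Not part of a contpte range: consistent on its own. */
		if (!(orig & PTE_VALID) || !(orig & PTE_CONT))
			return orig;

		size_t base = idx & ~(size_t)(CONT_PTES - 1);
		uint64_t base_pfn = pte_pfn(orig) - (idx - base);
		uint64_t result = orig;
		int consistent = 1;

		for (size_t i = 0; i < CONT_PTES; i++) {
			uint64_t pte = atomic_load_explicit(&table[base + i],
							    memory_order_relaxed);

			/* Valid, CONT set, contiguous pfn, same prot. */
			if (!(pte & PTE_VALID) || !(pte & PTE_CONT) ||
			    pte_pfn(pte) != base_pfn + i ||
			    pte_prot(pte) != pte_prot(orig)) {
				consistent = 0;	/* raced with an update */
				break;
			}
			result |= pte & (PTE_AF | PTE_DIRTY);
		}

		if (consistent)
			return result;
		/* Otherwise retry from the top. */
	}
}
```

The loop only terminates once a full pass over the block sees matching
entries, which mirrors the "start again" wording in the comment. Note that a
consistent-looking pass is still a snapshot assembled from several separate
loads, which is part of why dropping the access/dirty dependency on the
lockless path is attractive.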