[PATCH 2/2] arm64/mm: Improve comment in contpte_ptep_get_lockless()

David Hildenbrand david at redhat.com
Mon Feb 26 04:40:54 PST 2024


On 26.02.24 13:37, Ryan Roberts wrote:
> On 26/02/2024 12:30, David Hildenbrand wrote:
>> On 26.02.24 13:03, Ryan Roberts wrote:
>>> Make clear the atomicity/consistency requirements of the API and how we
>>> achieve them.
>>>
>>> Link: https://lore.kernel.org/linux-mm/Zc-Tqqfksho3BHmU@arm.com/
>>> Signed-off-by: Ryan Roberts <ryan.roberts at arm.com>
>>> ---
>>>    arch/arm64/mm/contpte.c | 24 ++++++++++++++----------
>>>    1 file changed, 14 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
>>> index be0a226c4ff9..1b64b4c3f8bf 100644
>>> --- a/arch/arm64/mm/contpte.c
>>> +++ b/arch/arm64/mm/contpte.c
>>> @@ -183,16 +183,20 @@ EXPORT_SYMBOL_GPL(contpte_ptep_get);
>>>    pte_t contpte_ptep_get_lockless(pte_t *orig_ptep)
>>>    {
>>>        /*
>>> -     * Gather access/dirty bits, which may be populated in any of the ptes
>>> -     * of the contig range. We may not be holding the PTL, so any contiguous
>>> -     * range may be unfolded/modified/refolded under our feet. Therefore we
>>> -     * ensure we read a _consistent_ contpte range by checking that all ptes
>>> -     * in the range are valid and have CONT_PTE set, that all pfns are
>>> -     * contiguous and that all pgprots are the same (ignoring access/dirty).
>>> -     * If we find a pte that is not consistent, then we must be racing with
>>> -     * an update so start again. If the target pte does not have CONT_PTE
>>> -     * set then that is considered consistent on its own because it is not
>>> -     * part of a contpte range.
>>> +     * The ptep_get_lockless() API requires us to read and return *orig_ptep
>>> +     * so that it is self-consistent, without the PTL held, so we may be
>>> +     * racing with other threads modifying the pte. Usually a READ_ONCE()
>>> +     * would suffice, but for the contpte case, we also need to gather the
>>> +     * access and dirty bits from across all ptes in the contiguous block,
>>> +     * and we can't read all of those neighbouring ptes atomically, so any
>>> +     * contiguous range may be unfolded/modified/refolded under our feet.
>>> +     * Therefore we ensure we read a _consistent_ contpte range by checking
>>> +     * that all ptes in the range are valid and have CONT_PTE set, that all
>>> +     * pfns are contiguous and that all pgprots are the same (ignoring
>>> +     * access/dirty). If we find a pte that is not consistent, then we must
>>> +     * be racing with an update so start again. If the target pte does not
>>> +     * have CONT_PTE set then that is considered consistent on its own
>>> +     * because it is not part of a contpte range.
>>>         */
>>>          pgprot_t orig_prot;
>>
>> Reviewed-by: David Hildenbrand <david at redhat.com>
> 
> Thanks!
> 
>>
>> In an ideal world, we'd really not rely on any accessed/dirty on the lockless
>> path and remove contpte_ptep_get_lockless() completely :)
> 
> Not sure if you saw my RFC to do exactly that? (well, it doesn't actually remove
> [contpte_]ptep_get_lockless() but it does remove all the callers). If you have
> any feedback, we could get this moving...

Yes, I saw it. Hoping we can get that in and then maybe remove the 
contpte_ptep_get_lockless() once all callers are gone :)

... on my todo list.
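
For anyone reading along, the retry scheme that the updated comment
describes can be sketched as a small userspace model. This is only an
illustration under stated assumptions, not the kernel code: the bit
layout, the M_* constants, the 16-entry block size and the model_*
helpers are all hypothetical stand-ins, and C11 atomics stand in for
the kernel's single-entry reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for this model only; not the arm64 format. */
#define M_VALID     (UINT64_C(1) << 0)
#define M_CONT      (UINT64_C(1) << 1)
#define M_YOUNG     (UINT64_C(1) << 2)	/* stand-in for the access bit */
#define M_DIRTY     (UINT64_C(1) << 3)
#define M_PFN_SHIFT 12
#define M_LOW_MASK  ((UINT64_C(1) << M_PFN_SHIFT) - 1)
#define M_NR_CONT   16			/* assumed entries per block */

typedef _Atomic uint64_t model_pte_t;

/* Build a model entry from a pfn, 8 "protection" bits and flag bits. */
static uint64_t model_mk_pte(uint64_t pfn, uint64_t prot, uint64_t flags)
{
	return (pfn << M_PFN_SHIFT) | ((prot & 0xff) << 4) | flags;
}

static uint64_t m_pfn(uint64_t pte)
{
	return pte >> M_PFN_SHIFT;
}

/* All non-pfn bits, ignoring access/dirty. Includes VALID and CONT. */
static uint64_t m_prot(uint64_t pte)
{
	return pte & M_LOW_MASK & ~(M_YOUNG | M_DIRTY);
}

/*
 * Return table[idx] with access/dirty gathered from its whole block.
 * Each entry is read atomically, but the block as a whole is not, so
 * any inconsistency is treated as a racing update and we start again.
 */
static uint64_t model_get_lockless(model_pte_t *table, size_t nr, size_t idx)
{
	uint64_t orig, pte, base_pfn, gathered;
	size_t start, i;

	assert(idx < nr && nr % M_NR_CONT == 0);

retry:
	orig = atomic_load(&table[idx]);

	/* Not part of a contiguous block: consistent on its own. */
	if (!(orig & M_VALID) || !(orig & M_CONT))
		return orig;

	start = idx - (idx % M_NR_CONT);
	base_pfn = m_pfn(orig) - (idx - start);
	gathered = 0;

	for (i = 0; i < M_NR_CONT; i++) {
		pte = atomic_load(&table[start + i]);

		/* m_prot() equality also implies VALID and CONT are set. */
		if (m_prot(pte) != m_prot(orig) || m_pfn(pte) != base_pfn + i)
			goto retry;

		gathered |= pte & (M_YOUNG | M_DIRTY);
	}

	return orig | gathered;
}
```

In this model, a caller that asks for entry 5 of a block where only
entries 3 and 9 carry the dirty and access bits gets back entry 5's
pfn with both bits set. An entry without M_CONT is returned as read,
with nothing gathered from its neighbours.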

-- 
Cheers,

David / dhildenb



