[PATCH v3] mm: Fix race between __split_huge_pmd_locked() and GUP-fast

Felix Kuehling felix.kuehling at amd.com
Wed Aug 28 15:21:47 PDT 2024


On 2024-05-02 09:47, Ryan Roberts wrote:
> On 02/05/2024 14:08, David Hildenbrand wrote:
>> On 01.05.24 16:33, Ryan Roberts wrote:
>>> __split_huge_pmd_locked() can be called for a present THP, devmap or
>>> (non-present) migration entry. It calls pmdp_invalidate()
>>> unconditionally on the pmdp and only determines if it is present or not
>>> based on the returned old pmd. This is a problem for the migration entry
>>> case because pmd_mkinvalid(), called by pmdp_invalidate(), must only be
>>> called for a present pmd.
>>>
>>> On arm64 at least, pmd_mkinvalid() will mark the pmd such that any
>>> future call to pmd_present() will return true. And therefore any
>>> lockless pgtable walker could see the migration entry pmd in this state
>>> and start interpreting the fields as if it were present, leading to
>>> BadThings (TM). GUP-fast appears to be one such lockless pgtable walker.
>>>
>>> x86 does not suffer the above problem, but instead pmd_mkinvalid() will
>>> corrupt the offset field of the swap entry within the swap pte. See link
>>> below for discussion of that problem.
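To make the arm64 failure mode described above concrete, here is a minimal
user-space sketch. It is a toy model, not kernel code: the bit layout, the
toy_* names and the offset value are all invented for illustration, and the
guarded variant only shows the "invalidate only if present" rule stated in the
description, not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: this bit layout is hypothetical, not arm64's real one. */
typedef uint64_t toy_pmd_t;

#define TOY_VALID            (1ULL << 0)  /* "hardware" valid bit */
#define TOY_PRESENT_INVALID  (1ULL << 1)  /* software: present but invalidated */
#define TOY_SWAP_SHIFT       8            /* where the swap offset is stored */

/* Present means: valid, or marked as a temporarily invalidated present entry. */
bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd & (TOY_VALID | TOY_PRESENT_INVALID)) != 0;
}

/* Only meaningful for a present entry: clear valid, remember it was present. */
toy_pmd_t toy_pmd_mkinvalid(toy_pmd_t pmd)
{
	return (pmd | TOY_PRESENT_INVALID) & ~TOY_VALID;
}

/* A non-present migration entry: valid bit clear, offset in the upper bits. */
toy_pmd_t toy_make_migration_entry(uint64_t offset)
{
	return offset << TOY_SWAP_SHIFT;
}

/* Problematic pattern: invalidate first, look at the old value afterwards. */
toy_pmd_t toy_split_unconditional(toy_pmd_t *pmdp)
{
	toy_pmd_t old = *pmdp;

	*pmdp = toy_pmd_mkinvalid(old);
	return old;
}

/* Guarded pattern: only invalidate an entry that is actually present. */
toy_pmd_t toy_split_guarded(toy_pmd_t *pmdp)
{
	toy_pmd_t old = *pmdp;

	if (toy_pmd_present(old))
		*pmdp = toy_pmd_mkinvalid(old);
	return old;
}

/* Self-check showing the difference between the two patterns. */
void toy_self_check(void)
{
	toy_pmd_t a = toy_make_migration_entry(0x1234);
	toy_pmd_t b = toy_make_migration_entry(0x1234);

	assert(!toy_pmd_present(a));
	toy_split_unconditional(&a);
	assert(toy_pmd_present(a));   /* migration entry now looks present */

	toy_split_guarded(&b);
	assert(!toy_pmd_present(b));  /* entry left untouched */
	assert(b == toy_make_migration_entry(0x1234));
}
```

In this model a lockless walker that tests toy_pmd_present() after the
unconditional variant would treat the migration entry as a present pmd, which
is the hazard described for GUP-fast. The x86 corruption of the swap offset
mentioned above is not modelled here.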
>> Could that explain:
>>
>> https://lore.kernel.org/all/YjoGbhreg8lGCGIJ@linutronix.de/
>>
>> Where the PFN of a migration entry might have been corrupted?
> Ahh interesting! Yes, it seems to fit...
>
>> Ccing Felix
> Are you able to reliably reproduce the bug, Felix? If so, would you mind trying
> with this patch to see if it goes away?

Sorry, this question got lost and I found it while cleaning my inbox. 
The team that ran into this problem reported that it was a one-off
failure. They didn't see this problem again. So I can't help with 
verifying the fix.

Regards,
   Felix


>
>> Patch itself looks good to me
>>
>> Acked-by: David Hildenbrand <david at redhat.com>
> Thanks!
>



More information about the linux-arm-kernel mailing list