[PATCH v2 4/6] mm: add a batched helper to clear the young flag for large folios
Baolin Wang
baolin.wang at linux.alibaba.com
Mon Mar 2 18:36:37 PST 2026
On 3/2/26 5:07 PM, David Hildenbrand (Arm) wrote:
> On 2/27/26 10:44, Baolin Wang wrote:
>> Currently, MGLRU calls ptep_test_and_clear_young_notify() to check and
>> clear the young flag for each PTE sequentially, which is inefficient for
>> large folio reclamation.
>>
>> Moreover, on Arm64 architecture, which supports contiguous PTEs, the Arm64-
>> specific ptep_test_and_clear_young() already implements an optimization to
>> clear the young flags for PTEs within a contiguous range. However, this is not
>> sufficient. Similar to the Arm64 specific clear_flush_young_ptes(), we can
>> extend this to perform batched operations for the entire large folio (which
>> might exceed the contiguous range: CONT_PTE_SIZE).
>>
>> Thus, we can introduce a new batched helper: test_and_clear_young_ptes() and
>> its wrapper test_and_clear_young_ptes_notify() which are consistent with the
>> existing functions, to perform batched checking of the young flags for large
>> folios, which can help improve performance during large folio reclamation when
>> MGLRU is enabled. And it will be overridden by the architecture that implements
>> a more efficient batch operation in the following patches.
>>
>> Signed-off-by: Baolin Wang <baolin.wang at linux.alibaba.com>
>> ---
>> include/linux/pgtable.h | 38 ++++++++++++++++++++++++++++++++++++++
>> mm/internal.h | 16 +++++++++++-----
>> 2 files changed, 49 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index 776993d4567b..29bd9fd04e1e 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -1103,6 +1103,44 @@ static inline int clear_flush_young_ptes(struct vm_area_struct *vma,
>> }
>> #endif
>>
>> +#ifndef test_and_clear_young_ptes
>> +/**
>> + * test_and_clear_young_ptes - Mark PTEs that map consecutive pages of the same
>> + * folio as old
>> + * @vma: The virtual memory area the pages are mapped into.
>> + * @addr: Address the first page is mapped at.
>> + * @ptep: Page table pointer for the first entry.
>> + * @nr: Number of entries to clear access bit.
>> + *
>> + * May be overridden by the architecture; otherwise, implemented as a simple
>> + * loop over ptep_test_and_clear_young().
>> + *
>> + * Note that PTE bits in the PTE range besides the PFN can differ. For example,
>> + * some PTEs might be write-protected.
>> + *
>> + * Context: The caller holds the page table lock. The PTEs map consecutive
>> + * pages that belong to the same folio. The PTEs are all in the same PMD.
>> + *
>> + * Returns: whether any PTE was young.
>> + */
>> +static inline int test_and_clear_young_ptes(struct vm_area_struct *vma,
>> + unsigned long addr, pte_t *ptep,
>> + unsigned int nr)
>
> Two tabs ...

Ah, yes, not sure why I missed this one :(

> What happened to using a boolean as return type and for "int young"?

As I replied to you previously [1], I’d like to do this in a follow-up
patchset that converts all functions that check the young flag. Does
that sound OK to you?

[1] https://lore.kernel.org/all/32c538ce-6af8-48a8-86fc-d26ee253af54@linux.alibaba.com/
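
For readers following along, the generic fallback that the kerneldoc describes
("a simple loop over ptep_test_and_clear_young()") could be sketched in
userspace roughly as below. This is only an illustration, not the code in this
patch: the mock_* names, MOCK_PTE_YOUNG, MOCK_PAGE_SIZE and the simplified
pte_t are assumptions made up for the example, and the vma argument is dropped.
The return type stays int here, matching the quoted signature.

```c
/* Hypothetical stand-ins for kernel types; not the real definitions. */
typedef struct { unsigned long val; } pte_t;
#define MOCK_PTE_YOUNG 0x1UL
#define MOCK_PAGE_SIZE 4096UL

/* Mock of the per-PTE primitive: report the old young bit, then clear it. */
static int mock_ptep_test_and_clear_young(unsigned long addr, pte_t *ptep)
{
	int young = (ptep->val & MOCK_PTE_YOUNG) != 0;

	(void)addr;	/* unused in this mock */
	ptep->val &= ~MOCK_PTE_YOUNG;
	return young;
}

/* Fallback batch helper: walk nr consecutive PTEs and OR the results. */
static int mock_test_and_clear_young_ptes(unsigned long addr, pte_t *ptep,
					  unsigned int nr)
{
	int young = 0;

	for (; nr; nr--, ptep++, addr += MOCK_PAGE_SIZE)
		young |= mock_ptep_test_and_clear_young(addr, ptep);

	return young;
}
```

Calling the helper on a range where at least one entry has the young bit set
returns nonzero and leaves every entry old; a second call on the same range
returns 0.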