[PATCH 3/5] mm: add a batched helper to clear the young flag for large folios

David Hildenbrand (Arm) david at kernel.org
Wed Feb 25 06:04:17 PST 2026


On 2/24/26 02:56, Baolin Wang wrote:
> Currently, MGLRU will call ptep_clear_young_notify() to check and clear the
> young flag for each PTE sequentially, which is inefficient for large folios
> reclamation.
> 
> Moreover, on Arm64 architecture, which supports contiguous PTEs, the Arm64-
> specific ptep_test_and_clear_young() already implements an optimization to
> clear the young flags for PTEs within a contiguous range. However, this is not
> sufficient. Similar to the Arm64 specific clear_flush_young_ptes(), we can
> extend this to perform batched operations for the entire large folio (which
> might exceed the contiguous range: CONT_PTE_SIZE).
> 
> Thus, we can introduce a new batched helper: test_and_clear_young_ptes() and
> its wrapper clear_young_ptes_notify(), to perform batched checking of the young
> flags for large folios, which can help improve performance during large folio
> reclamation when MGLRU is enabled. And it will be overridden by the architecture
> that implements a more efficient batch operation in the following patches.
> 

Maybe mention that the implementation follows the other existing batched
helpers (e.g., clear_flush_young_ptes()).

> Signed-off-by: Baolin Wang <baolin.wang at linux.alibaba.com>
> ---
>  include/linux/pgtable.h | 36 ++++++++++++++++++++++++++++++++++++
>  mm/internal.h           | 23 ++++++++++++++++++-----
>  2 files changed, 54 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 776993d4567b..0bcd3be524d3 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1103,6 +1103,42 @@ static inline int clear_flush_young_ptes(struct vm_area_struct *vma,
>  }
>  #endif
>  
> +#ifndef test_and_clear_young_ptes
> +/**
> + * test_and_clear_young_ptes - Mark PTEs that map consecutive pages of the same
> + *			       folio as old
> + * @vma: The virtual memory area the pages are mapped into.
> + * @addr: Address the first page is mapped at.
> + * @ptep: Page table pointer for the first entry.
> + * @nr: Number of entries to clear access bit.
> + *
> + * May be overridden by the architecture; otherwise, implemented as a simple
> + * loop over ptep_test_and_clear_young().
> + *
> + * Note that PTE bits in the PTE range besides the PFN can differ. For example,
> + * some PTEs might be write-protected.

Document the return value?

Returns: whether any PTE was young.

Or something like that.

> + *
> + * Context: The caller holds the page table lock.  The PTEs map consecutive
> + * pages that belong to the same folio.  The PTEs are all in the same PMD.
> + */
> +static inline int test_and_clear_young_ptes(struct vm_area_struct *vma,
> +					    unsigned long addr, pte_t *ptep,
> +					    unsigned int nr)

Please indent the continuation lines with two tabs instead of aligning them
with the opening parenthesis.

> +{
> +	int young = 0;
> +
> +	for (;;) {
> +		young |= ptep_test_and_clear_young(vma, addr, ptep);
> +		if (--nr == 0)
> +			break;
> +		ptep++;
> +		addr += PAGE_SIZE;
> +	}
> +
> +	return young;

BTW: can this function simply return (and use) a bool instead?

Likely we should do the same for the other functions, but that can be
done separately.
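
To illustrate, here is a rough userspace sketch of what a bool-returning
variant could look like. Everything prefixed with mock_ is a hypothetical
stand-in (there is no vma or addr, and the "PTE" only models the young bit),
so take it as an illustration of the loop shape rather than kernel code:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a PTE: only the young (accessed) bit is modeled. */
typedef struct {
	bool young;
} mock_pte_t;

/* Stand-in for ptep_test_and_clear_young(): report the bit, then clear it. */
static bool mock_ptep_test_and_clear_young(mock_pte_t *ptep)
{
	bool young = ptep->young;

	ptep->young = false;
	return young;
}

/*
 * Batched variant returning bool. Same loop shape as in the patch, so the
 * caller must pass nr >= 1.
 */
static bool mock_test_and_clear_young_ptes(mock_pte_t *ptep, unsigned int nr)
{
	bool young = false;

	for (;;) {
		young |= mock_ptep_test_and_clear_young(ptep);
		if (--nr == 0)
			break;
		ptep++;
	}

	return young;
}
```

The result is true if at least one entry in the range was young, and every
entry in the range is old afterwards.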

> +}
> +#endif
> +
>  /*
>   * On some architectures hardware does not set page access bit when accessing
>   * memory page, it is responsibility of software setting this bit. It brings
> diff --git a/mm/internal.h b/mm/internal.h
> index 1ba175b8d4f1..1b59be99dc3f 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1813,16 +1813,23 @@ static inline int pmdp_clear_flush_young_notify(struct vm_area_struct *vma,
>  	return young;
>  }
>  
> -static inline int ptep_clear_young_notify(struct vm_area_struct *vma,
> -					  unsigned long addr, pte_t *ptep)
> +static inline int clear_young_ptes_notify(struct vm_area_struct *vma,
> +					  unsigned long addr, pte_t *ptep,
> +					  unsigned int nr)
>  {
>  	int young;
>  
> -	young = ptep_test_and_clear_young(vma, addr, ptep);
> -	young |= mmu_notifier_clear_young(vma->vm_mm, addr, addr + PAGE_SIZE);
> +	young = test_and_clear_young_ptes(vma, addr, ptep, nr);
> +	young |= mmu_notifier_clear_young(vma->vm_mm, addr, addr + nr * PAGE_SIZE);
>  	return young;
>  }
>  
> +static inline int ptep_clear_young_notify(struct vm_area_struct *vma,
> +					  unsigned long addr, pte_t *ptep)
> +{
> +	return clear_young_ptes_notify(vma, addr, ptep, 1);
> +}
> +
>  static inline int pmdp_clear_young_notify(struct vm_area_struct *vma,
>  					  unsigned long addr, pmd_t *pmdp)
>  {
> @@ -1837,9 +1844,15 @@ static inline int pmdp_clear_young_notify(struct vm_area_struct *vma,
>  
>  #define clear_flush_young_ptes_notify	clear_flush_young_ptes
>  #define pmdp_clear_flush_young_notify	pmdp_clear_flush_young
> -#define ptep_clear_young_notify	ptep_test_and_clear_young
> +#define clear_young_ptes_notify	test_and_clear_young_ptes
>  #define pmdp_clear_young_notify	pmdp_test_and_clear_young
>  
> +static inline int ptep_clear_young_notify(struct vm_area_struct *vma,
> +					  unsigned long addr, pte_t *ptep)
> +{
> +	return test_and_clear_young_ptes(vma, addr, ptep, 1);
> +}

Why not have a single generic definition outside of the ifdef:

static inline int ptep_clear_young_notify(struct vm_area_struct *vma,
		 unsigned long addr, pte_t *ptep)
{
	return clear_young_ptes_notify(vma, addr, ptep, 1);
}

Same comment regarding bool.
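
The resulting layout could look roughly like the userspace sketch below. All
demo_ names and the DEMO_HAVE_NOTIFIER macro are hypothetical placeholders
for the real helpers and config option; the point is only that the
single-PTE wrapper is defined once, after the ifdef, and works with either
branch:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a PTE: only the young bit is modeled. */
typedef struct {
	bool young;
} demo_pte_t;

/* Stand-in for the batched helper: test and clear young on nr entries. */
static bool demo_test_and_clear_young_ptes(demo_pte_t *ptep, unsigned int nr)
{
	bool young = false;

	for (unsigned int i = 0; i < nr; i++) {
		young |= ptep[i].young;
		ptep[i].young = false;
	}

	return young;
}

#ifdef DEMO_HAVE_NOTIFIER
/* Placeholder for the notifier call; a secondary MMU could report young. */
static bool demo_notifier_clear_young(void)
{
	return false;
}

static bool demo_clear_young_ptes_notify(demo_pte_t *ptep, unsigned int nr)
{
	bool young = demo_test_and_clear_young_ptes(ptep, nr);

	young |= demo_notifier_clear_young();
	return young;
}
#else
/* Without notifiers, the _notify variant is just an alias. */
#define demo_clear_young_ptes_notify	demo_test_and_clear_young_ptes
#endif

/* Defined once, outside of the ifdef, for both configurations. */
static bool demo_ptep_clear_young_notify(demo_pte_t *ptep)
{
	return demo_clear_young_ptes_notify(ptep, 1);
}
```

That way only the batched helper differs between the two configurations.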

-- 
Cheers,

David
