[PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range()

Jonathan Cameron jonathan.cameron at huawei.com
Tue Mar 3 01:57:18 PST 2026


On Mon, 2 Mar 2026 13:55:58 +0000
Ryan Roberts <ryan.roberts at arm.com> wrote:

> Refactor function variants with "_nosync", "_local" and "_nonotify" into
> a single __always_inline implementation that takes flags and rely on
> constant folding to select the parts that are actually needed at any
> given callsite, based on the provided flags.
> 
> Flags all live in the tlbf_t (TLB flags) type; TLBF_NONE (0) continues
> to provide the strongest semantics (i.e. evict from walk cache,
> broadcast, synchronise and notify). Each flag reduces the strength in
> some way; TLBF_NONOTIFY, TLBF_NOSYNC and TLBF_NOBROADCAST are added to
> complement the existing TLBF_NOWALKCACHE.
> 
> There are no users that require TLBF_NOBROADCAST without
> TLBF_NOWALKCACHE so implement that as BUILD_BUG() to avoid needing to
> introduce dead code for vae1 invalidations.
> 
> The result is a clearer, simpler, more powerful API.

Hi Ryan,

There is one subtle change to rounding that should at least be called out.

It might even be worth pulling it into a precursor patch, where you can
explain why the original code was rounding to a larger value than was
ever needed.

Jonathan


> 
> Signed-off-by: Ryan Roberts <ryan.roberts at arm.com>


>  static inline void __flush_tlb_range(struct vm_area_struct *vma,
> @@ -586,24 +615,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
>  				     unsigned long stride, int tlb_level,
>  				     tlbf_t flags)
>  {
> -	__flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
> -				 tlb_level, flags);
> -	__tlbi_sync_s1ish();
> -}
> -
> -static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
> -					   unsigned long addr)
> -{
> -	unsigned long asid;
> -
> -	addr = round_down(addr, CONT_PTE_SIZE);
See below.
> -
> -	dsb(nshst);
> -	asid = ASID(vma->vm_mm);
> -	__flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
> -	mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
> -						    addr + CONT_PTE_SIZE);
> -	dsb(nsh);
> +	start = round_down(start, stride);
See below.
> +	end = round_up(end, stride);
> +	__do_flush_tlb_range(vma, start, end, stride, tlb_level, flags);
>  }

>  
>  static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> index 681f22fac52a1..3f1a3e86353de 100644
> --- a/arch/arm64/mm/contpte.c
> +++ b/arch/arm64/mm/contpte.c
...

> @@ -641,7 +641,10 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
>  			__ptep_set_access_flags(vma, addr, ptep, entry, 0);
>  
>  		if (dirty)
> -			local_flush_tlb_contpte(vma, start_addr);
> +			__flush_tlb_range(vma, start_addr,
> +					  start_addr + CONT_PTE_SIZE,
> +					  PAGE_SIZE, 3,

This rounds the start address down to a different alignment than before.
local_flush_tlb_contpte() did
	addr = round_down(addr, CONT_PTE_SIZE);

With this call we have
	start = round_down(start, stride);
where stride is PAGE_SIZE.

I'm too lazy to figure out if that matters.
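
For reference, the difference between the two roundings above can be
sketched in standalone C. This is only an illustration, not kernel code:
the sketch_* names are made up, and 4K pages with 16 PTEs per contiguous
block are assumed. sketch_old_range() models the rounding done by
local_flush_tlb_contpte(); sketch_new_range() models what
__flush_tlb_range() does with stride == PAGE_SIZE when called with
(start_addr, start_addr + CONT_PTE_SIZE).

```c
#include <assert.h>

/* Assumed geometry for illustration: 4K pages, 16 PTEs per contpte block. */
#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_CONT_PTES	16UL
#define SKETCH_CONT_PTE_SIZE	(SKETCH_CONT_PTES * SKETCH_PAGE_SIZE)

/* Half-open address range [start, end). */
struct sketch_range {
	unsigned long start;
	unsigned long end;
};

/* Power-of-two round down, standing in for the kernel's round_down(). */
static unsigned long sketch_round_down(unsigned long x, unsigned long align)
{
	assert(align && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Power-of-two round up, standing in for the kernel's round_up(). */
static unsigned long sketch_round_up(unsigned long x, unsigned long align)
{
	assert(align && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Old behaviour: round down to CONT_PTE_SIZE, cover one contpte block. */
static struct sketch_range sketch_old_range(unsigned long addr)
{
	struct sketch_range r;

	r.start = sketch_round_down(addr, SKETCH_CONT_PTE_SIZE);
	r.end = r.start + SKETCH_CONT_PTE_SIZE;
	return r;
}

/* New behaviour: round start down and end up to the stride (PAGE_SIZE). */
static struct sketch_range sketch_new_range(unsigned long start_addr)
{
	struct sketch_range r;

	r.start = sketch_round_down(start_addr, SKETCH_PAGE_SIZE);
	r.end = sketch_round_up(start_addr + SKETCH_CONT_PTE_SIZE,
				SKETCH_PAGE_SIZE);
	return r;
}
```

Under these assumptions the two ranges are identical whenever the address
passed in is already CONT_PTE_SIZE aligned, and differ otherwise (0x13000
gives [0x10000, 0x20000) with the old rounding and [0x13000, 0x23000) with
the new one). So whether the change matters comes down to whether
start_addr is always CONT_PTE_SIZE aligned at this callsite, which is not
established here.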


> +					  TLBF_NOWALKCACHE | TLBF_NOBROADCAST);
>  	} else {
>  		__contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
>  		__ptep_set_access_flags(vma, addr, ptep, entry, dirty);
