[PATCH v5 3/3] arm64/mm: Elide tlbi in contpte_convert() under BBML2

Mikołaj Lenczewski miko.lenczewski at arm.com
Thu Apr 3 01:28:01 PDT 2025


On Thu, Apr 03, 2025 at 09:14:43AM +0100, Ryan Roberts wrote:
> On 25/03/2025 05:36, Mikołaj Lenczewski wrote:
> > diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> > index 55107d27d3f8..77ed03b30b72 100644
> > --- a/arch/arm64/mm/contpte.c
> > +++ b/arch/arm64/mm/contpte.c
> > @@ -68,7 +68,8 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
> >  			pte = pte_mkyoung(pte);
> >  	}
> >  
> > -	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
> > +	if (!system_supports_bbml2_noabort())
> > +		__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
> >  
> >  	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
> 
> Despite all the conversation we had about completely eliding the TLBI for the
> BBML2 case, I've continued to be a bit uneasy about it. I had another chat with
> Alex C and we concluded that it is safe, but there could be conceivable
> implementations where it is not performant. Alex suggested doing a TLBI without
> the DSB and I think that's a good idea. So after the __set_ptes(), I suggest adding:
> 
> 	if (system_supports_bbml2_noabort())
> 		__flush_tlb_range_nosync(mm, start_addr, addr, PAGE_SIZE,
> 					 true, 3);
> 
> That will issue the TLBI but won't wait for it to complete. So it should be very
> fast. We are guaranteed correctness immediately. We are guaranteed performance
> after the next DSB (worst-case; next context switch).
> 
> Thanks,
> Ryan

Hi Ryan,

Sure, perfectly happy to add that on. Will respin and add a note about
this behaviour to the source code and to the patch / cover letter.
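
As a rough illustration of the ordering being discussed, here is a small
userspace model of the tail of contpte_convert(). It is only a sketch: the
stub_* helpers and contpte_convert_model() are hypothetical stand-ins that
record which step ran, not the kernel functions themselves, and the earlier
part of contpte_convert() (clearing the old entries) is left out.

```c
/* Userspace model only: every helper here is a hypothetical stub. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static char trace[128];

/* Append a step name to the comma-separated trace. */
static void record(const char *step)
{
	/* Room for an optional comma, the step name and the terminator. */
	assert(strlen(trace) + strlen(step) + 2 <= sizeof(trace));
	if (trace[0] != '\0')
		strcat(trace, ",");
	strcat(trace, step);
}

/* Stand-in for __flush_tlb_range(): TLBI, then wait with a DSB. */
static void stub_flush_tlb_range(void)
{
	record("tlbi+dsb");
}

/* Stand-in for __flush_tlb_range_nosync(): TLBI issued, no trailing DSB. */
static void stub_flush_tlb_range_nosync(void)
{
	record("tlbi");
}

/* Stand-in for __set_ptes(): install the new entries. */
static void stub_set_ptes(void)
{
	record("set_ptes");
}

/*
 * Models the proposed ordering:
 *  - without BBML2-noabort: synchronous flush first, then set the ptes;
 *  - with BBML2-noabort: set the ptes first, then an unsynchronised TLBI.
 * Returns a pointer to a static buffer that is reset on each call.
 */
const char *contpte_convert_model(bool bbml2_noabort)
{
	trace[0] = '\0';

	if (!bbml2_noabort)
		stub_flush_tlb_range();

	stub_set_ptes();

	if (bbml2_noabort)
		stub_flush_tlb_range_nosync();

	return trace;
}
```

Calling contpte_convert_model(false) walks the existing path (flush, then
set), while contpte_convert_model(true) walks the BBML2 path where the TLBI
is issued after the new entries are written and nothing waits on it.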

-- 
Kind regards,
Mikołaj Lenczewski


