[PATCH -v2 2/2] arm64, tlbflush: don't TLBI broadcast if page reused in write fault

Wed Oct 22 03:22:02 PDT 2025

On Wed, Oct 22, 2025 at 10:55 PM Barry Song <21cnbao at gmail.com> wrote:
>
> On Wed, Oct 22, 2025 at 10:46 PM Huang, Ying
> <ying.huang at linux.alibaba.com> wrote:
>
> > >
> > > I agree. Yet the ish barrier can still avoid the page faults during CPU0's PTL.
> >
> > IIUC, you think that dsb(ish) compared with dsb(nsh) can accelerate
> > memory writing (visible to other CPUs).  TBH, I suspect that this is the
> > case.
>
> Why? In any case, nsh is not an SMP domain.
>
> I believe a dmb(ishst) is sufficient to ensure that the new PTE writes are
> visible to other CPUs. I’m not quite sure why the current flush code uses
> dsb(ish); it seems like overkill.

On second thought, the PTE/page table walker might not be a typical SMP sync
case, so a dmb may not be sufficient: we are not dealing with standard
load/store instruction sequences across multiple threads. In any case, my
point is that dsb(ish) might be slightly slower than your dsb(nsh), but it
makes the PTE visible to other CPUs earlier and helps avoid some page faults
after we’ve written the PTE. However, if your current nsh version actually
performs better, even when multiple threads may access the data
simultaneously, it should be completely fine.

Now you are doing:

write pte
don't broadcast pte
tlbi
don't broadcast tlbi

whereas we might do:

write pte
broadcast pte
tlbi
don't broadcast tlbi
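
To make the difference concrete, here is a rough userspace sketch of the two
orderings. It only records the order of operations as text; nothing here
executes a real barrier or TLBI. All function names are hypothetical stand-ins,
not the kernel's macros, and mapping "broadcast pte" to a dsb ish after the PTE
store and "don't broadcast tlbi" to a local tlbi vale1 followed by dsb nsh is
my reading of the two lists above, not necessarily what the patch does.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace buffer: records the order of operations as text. */
static char trace[128];

static void emit(const char *op)
{
	/* Need room for op, the ';' separator and the trailing NUL. */
	assert(strlen(trace) + strlen(op) + 2 <= sizeof(trace));
	strcat(trace, op);
	strcat(trace, ";");
}

/* Stand-ins for the real instructions (assumed names, for illustration). */
static void write_pte(void)  { emit("str pte"); }    /* store the new PTE */
static void dsb_nsh(void)    { emit("dsb nsh"); }    /* non-shareable barrier */
static void dsb_ish(void)    { emit("dsb ish"); }    /* inner-shareable barrier */
static void tlbi_local(void) { emit("tlbi vale1"); } /* non-broadcast TLBI */

/* First list: neither the PTE write nor the TLBI is pushed to other CPUs. */
const char *seq_local_only(void)
{
	trace[0] = '\0';
	write_pte();
	dsb_nsh();
	tlbi_local();
	dsb_nsh();
	return trace;
}

/* Second list: ish barrier for the PTE write, TLBI still local. */
const char *seq_ish_pte_local_tlbi(void)
{
	trace[0] = '\0';
	write_pte();
	dsb_ish();
	tlbi_local();
	dsb_nsh();
	return trace;
}
```

The only difference between the two is the barrier right after the PTE store;
the TLBI and its completion barrier stay local in both.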

Thanks
Barry