[PATCH -v2 2/2] arm64, tlbflush: don't TLBI broadcast if page reused in write fault

Barry Song 21cnbao at gmail.com
Wed Oct 22 02:37:54 PDT 2025


>
> With PTL, this becomes
>
> CPU0:                           CPU1:
>
> page fault                      page fault
> lock PTL
> write PTE
> do local tlbi
> unlock PTL
>                                 lock PTL        <- pte visible to CPU 1
>                                 read PTE        <- new PTE
>                                 do local tlbi   <- new PTE
>                                 unlock PTL

I agree. Yet an ish barrier can still avoid page faults on CPU1 while CPU0
holds the PTL.

CPU0:                           CPU1:

lock PTL
write pte;
Issue ish barrier
do local tlbi;
                                No page fault occurs if tlb misses
unlock PTL


Otherwise, it could be:


CPU0:                           CPU1:

lock PTL
write pte;
Issue nsh barrier
do local tlbi;
                                page fault occurs if tlb misses
unlock PTL


I'm not sure whether adding an ish barrier right after the PTE modification
has any noticeable performance impact on the test. I assume the most
expensive part is still the dsb for the tlbi broadcast, not the barrier that
syncs the PTE write to memory?
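
To make the two orderings concrete, here is a rough userspace sketch, not the
actual patch. update_pte_local(), local_tlbi_page() and local_tlbi_count are
hypothetical names made up for illustration; the real kernel code would use
its own dsb() and __tlbi() helpers. TLBI cannot be executed from EL0, so it is
stubbed out with a counter, and on non-arm64 builds dsb() degrades to a
compiler barrier (GCC/Clang inline asm is assumed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Real DSB on arm64 (allowed at EL0); compiler barrier elsewhere. */
#if defined(__aarch64__)
#define dsb(opt) __asm__ volatile("dsb " #opt : : : "memory")
#else
#define dsb(opt) __asm__ volatile("" : : : "memory")
#endif

/* Hypothetical stub: a local (non-broadcast) TLBI is privileged,
 * so this only counts how often it would have been issued. */
static unsigned long local_tlbi_count;

static void local_tlbi_page(uintptr_t addr)
{
	(void)addr;
	local_tlbi_count++;
}

/*
 * Hypothetical helper showing the sequence from the timelines above:
 * write the PTE, publish it, invalidate the local TLB entry, then wait
 * for the local invalidation to complete.
 *
 * use_ish != 0: dsb(ishst) after the PTE write (the first timeline).
 * use_ish == 0: dsb(nshst) after the PTE write (the second timeline).
 */
static void update_pte_local(volatile uint64_t *ptep, uint64_t pte,
			     uintptr_t addr, int use_ish)
{
	assert(ptep != NULL);

	*ptep = pte;		/* write pte */

	if (use_ish)
		dsb(ishst);	/* ish barrier on the PTE store */
	else
		dsb(nshst);	/* nsh barrier on the PTE store */

	local_tlbi_page(addr);	/* do local tlbi (stubbed) */
	dsb(nsh);		/* complete the local invalidation */
}
```

Only the barrier after the PTE store differs between the two variants; the
local tlbi and the trailing dsb(nsh) are the same in both, which is why I'd
expect the extra cost of the ish variant to be small, but that is an
assumption that needs measuring.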

Thanks
Barry


