[PATCH -v2 2/2] arm64, tlbflush: don't TLBI broadcast if page reused in write fault
Huang, Ying
ying.huang at linux.alibaba.com
Wed Oct 22 02:46:48 PDT 2025
Barry Song <21cnbao at gmail.com> writes:
>>
>> With PTL, this becomes
>>
>> CPU0:                         CPU1:
>>
>> page fault                    page fault
>> lock PTL
>> write PTE
>> do local tlbi
>> unlock PTL
>>                               lock PTL        <- pte visible to CPU 1
>>                               read PTE        <- new PTE
>>                               do local tlbi   <- new PTE
>>                               unlock PTL
>
> I agree. Yet the ish barrier can still avoid the page faults during CPU0's PTL.
IIUC, you think that, compared with dsb(nsh), dsb(ish) can make the PTE
write visible to other CPUs sooner. TBH, I suspect that this is the
case.
> CPU0:                         CPU1:
>
> lock PTL
>
> write pte;
> Issue ish barrier
> do local tlbi;
>
>                               No page fault occurs if tlb misses
>
> unlock PTL
>
>
> Otherwise, it could be:
>
>
> CPU0:                         CPU1:
>
> lock PTL
>
> write pte;
> Issue nsh barrier
> do local tlbi;
>
>                               page fault occurs if tlb misses
>
> unlock PTL
>
>
> Not quite sure if adding an ish right after the PTE modification has any
> noticeable performance impact on the test? I assume the most expensive part
> is still the tlbi broadcast dsb, not the PTE memory sync barrier?
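
To make the PTL sequence quoted at the top of this message concrete, here
is a minimal sequential model in userspace C. It is only a sketch under
stated assumptions: all names (model_mm, handle_write_fault, local_tlbi,
cpu_can_write) are hypothetical and are not kernel APIs, the lock is a
plain flag, and barriers (ish vs. nsh) are not modelled at all. It only
shows that a CPU which takes the PTL after the PTE update sees the new
PTE and needs nothing more than a local invalidation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS   2
#define PTE_WRITE 0x1u

/* Hypothetical model, driven from one thread in a fixed order. */
struct model_mm {
	bool ptl_held;              /* stands in for the page table lock */
	uint32_t pte;               /* the single PTE under discussion */
	uint32_t tlb[NR_CPUS];      /* per-CPU cached copy of the PTE */
	bool tlb_valid[NR_CPUS];
};

enum fault_result { FAULT_UPGRADED_PTE, FAULT_SPURIOUS };

static void model_mm_init(struct model_mm *mm)
{
	mm->ptl_held = false;
	mm->pte = 0;                /* starts read-only */
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		mm->tlb[cpu] = mm->pte; /* every CPU caches the read-only entry */
		mm->tlb_valid[cpu] = true;
	}
}

static void ptl_lock(struct model_mm *mm)
{
	assert(!mm->ptl_held);
	mm->ptl_held = true;
}

static void ptl_unlock(struct model_mm *mm)
{
	assert(mm->ptl_held);
	mm->ptl_held = false;
}

/* Invalidate only this CPU's cached entry: no broadcast. */
static void local_tlbi(struct model_mm *mm, int cpu)
{
	mm->tlb_valid[cpu] = false;
}

/* Write fault: the first CPU upgrades the PTE, later CPUs find it done. */
static enum fault_result handle_write_fault(struct model_mm *mm, int cpu)
{
	enum fault_result res;

	ptl_lock(mm);
	if (mm->pte & PTE_WRITE) {
		res = FAULT_SPURIOUS;       /* another CPU already wrote the PTE */
	} else {
		mm->pte |= PTE_WRITE;       /* write PTE */
		res = FAULT_UPGRADED_PTE;
	}
	local_tlbi(mm, cpu);            /* do local tlbi */
	ptl_unlock(mm);
	return res;
}

/* Models a write access: on a TLB miss, refill from the PTE. */
static bool cpu_can_write(struct model_mm *mm, int cpu)
{
	if (!mm->tlb_valid[cpu]) {
		mm->tlb[cpu] = mm->pte;
		mm->tlb_valid[cpu] = true;
	}
	return (mm->tlb[cpu] & PTE_WRITE) != 0;
}
```

In this model, after CPU0 handles its fault, CPU1 still holds the stale
read-only entry, so its next write faults once; that fault is spurious
and is resolved by a local invalidation only.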
---
Best Regards,
Huang, Ying