[PATCH -v2 2/2] arm64, tlbflush: don't TLBI broadcast if page reused in write fault

Huang, Ying ying.huang at linux.alibaba.com
Wed Oct 22 23:15:53 PDT 2025


Barry Song <21cnbao at gmail.com> writes:

>> >
>> > A:
>> > write pte
>> > don't broadcast pte
>> > tlbi
>> > don't broadcast tlbi
>> >
>> > with
>> >
>> > B:
>> > write pte
>> > broadcast pte
>>
>> I suspect that the pte will be broadcast; DVM broadcast isn't used for
>> memory coherency, IIUC.
>
> I guess you’re right. By “broadcast,” I actually meant the PTE becoming visible
> to other CPUs. With a dsb(ish) before tlbi, other cores’ TLBs can load the new
> PTE after their TLB is shot down. But as you said, if the hardware doesn’t
> propagate the updated PTE faster, it doesn’t seem to help reduce page faults.
>
> As a side note, I’m curious about how dsb(nsh) and dsb(ish) compare on your
> platform. Perhaps because the number of CPU cores is small, I didn’t see
> any noticeable difference between them on phones.

Sure.  I can give it a try.  Can you share the test case?

>>
>> > tlbi
>> > don't broadcast tlbi
>> >
>> > I guess the gain comes from "don't broadcast tlbi"?
>> > With B, we should be able to share much of the existing code.
>>
>> Ryan has some plan to reduce the code duplication with the current
>> solution.
>
> Ok.

---
Best Regards,
Huang, Ying


