[PATCH v2] arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path
Anshuman Khandual
anshuman.khandual at arm.com
Wed Jan 11 18:57:02 PST 2023
On 1/11/23 19:01, Gabriel Krisman Bertazi wrote:
> Anshuman Khandual <anshuman.khandual at arm.com> writes:
>
>> On 1/9/23 20:49, Gabriel Krisman Bertazi wrote:
>>> Accessing AA64MMFR1_EL1 is expensive in KVM guests, since it is emulated
>>> in the hypervisor. In fact, ARM documentation mentions some feature
>>> registers are not supposed to be accessed frequently by the OS, and
>>> therefore should be emulated for guests [1].
>>
>> I am just curious: is this (AA64MMFR1_EL1) the only system register access
>> causing such performance problems?
>
> I haven't audited all the system registers. For AA64MMFR1_EL1 this is
> the only instance where the frequency of access affects performance in a
> meaningful way for my workload.
>
> I have a real-world bug report about it. By profiling VM exit events, I
> can also argue that this is the only emulated system register read/write
> that happens frequently enough to change the order of magnitude of exit
> events measured by perf for my workload between 5.4 and mainline (the
> read was introduced in v5.12, but I have data going back to 5.4).
>
>>> Commit 0388f9c74330 ("arm64: mm: Implement
>>> arch_wants_old_prefaulted_pte()") introduced a read of this register in
>>> the page fault path. But, even when the feature of setting faultaround
>>
>> Right. Although cpu_has_hw_af() was added earlier via commit 47d7b15b88f9
>> ("arm64: cpufeature: introduce helper cpu_has_hw_af()"), the above commit
>> did add it to the regular page fault path via do_set_pte().
>
> Indeed. The only other user of this function is wp_page_copy(), and,
> from what I can tell, the call sits in an unlikely() branch taken when COW
> is being performed on a page that was recently unmapped. It is not
> frequent enough to have shown up in my profiling.
>
>>> ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
>>> }
>>
>> LGTM, but as mentioned earlier, are there no other similar instances, or is
>> this one just more problematic because it is on the direct page fault path?
>
> I think a full audit of the emulated system registers in KVM will be
> required to answer that definitively. But this instance is, by far, the
> hottest case in the codebase.
>
Thanks for the additional details.
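
For readers following the thread, the general idea discussed above (read an
expensive ID register once, then serve hot-path checks from a cached copy)
can be sketched in plain userspace C. This is only an illustration under
stated assumptions and is not the patch itself: slow_read_mmfr1(),
cache_init(), has_hw_af_uncached() and has_hw_af_cached() are hypothetical
names, the register value is made up, and the counter merely stands in for
the trap to the hypervisor that a guest would take on each real read. The
only architectural detail relied on is that HAFDBS is the 4-bit field at
bits [3:0] of ID_AA64MMFR1_EL1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* HAFDBS occupies bits [3:0] of ID_AA64MMFR1_EL1. */
#define HAFDBS_SHIFT 0

/* Counts simulated "expensive" reads (stand-in for VM exits in a guest). */
static unsigned int slow_read_count;

/* Hypothetical stand-in for a trapping ID register read. */
static uint64_t slow_read_mmfr1(void)
{
    slow_read_count++;
    return 0x2; /* assumed example value: HAFDBS field == 2 */
}

/* Extract an unsigned 4-bit ID field starting at bit 'shift'. */
static unsigned int extract_field(uint64_t reg, unsigned int shift)
{
    return (unsigned int)((reg >> shift) & 0xf);
}

/* Cached copy, filled once (stand-in for a boot-time snapshot). */
static uint64_t cached_mmfr1;
static bool cached_valid;

static void cache_init(void)
{
    cached_mmfr1 = slow_read_mmfr1();
    cached_valid = true;
}

/* Uncached check: pays for one slow read on every call. */
static bool has_hw_af_uncached(void)
{
    return extract_field(slow_read_mmfr1(), HAFDBS_SHIFT) != 0;
}

/* Cached check: no slow read on the hot path. */
static bool has_hw_af_cached(void)
{
    assert(cached_valid); /* cache_init() must have run first */
    return extract_field(cached_mmfr1, HAFDBS_SHIFT) != 0;
}
```

With this sketch, N calls to has_hw_af_uncached() cost N slow reads, while
cache_init() followed by N calls to has_hw_af_cached() costs exactly one.
The trade-off is that the cached value must be captured before the first
hot-path caller runs, which is why the cached variant asserts on it.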