[PATCH v2] arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path

Gabriel Krisman Bertazi krisman at suse.de
Wed Jan 11 05:31:21 PST 2023


Anshuman Khandual <anshuman.khandual at arm.com> writes:

> On 1/9/23 20:49, Gabriel Krisman Bertazi wrote:
>> Accessing AA64MMFR1_EL1 is expensive in KVM guests, since it is emulated
>> in the hypervisor.  In fact, ARM documentation mentions some feature
>> registers are not supposed to be accessed frequently by the OS, and
>> therefore should be emulated for guests [1].
>
> I am just curious, is this the only system register access (AA64MMFR1_EL1)
> causing such performance problems ?

I haven't audited all the system registers.  For AA64MMFR1_EL1 this is
the only instance where the frequency of access affects performance in a
meaningful way for my workload.

I have a real-world bug report about it.  By profiling vm exit events,
I can also argue that this is the only emulated msr read/write that
happens frequently enough to change the order of magnitude of the exit
events measured by perf for my workload between 5.4 and mainline.  (The
read was introduced in v5.12, but I have data going back to 5.4.)

>> Commit 0388f9c74330 ("arm64: mm: Implement
>> arch_wants_old_prefaulted_pte()") introduced a read of this register in
>> the page fault path.  But, even when the feature of setting faultaround
>
> Right, although cpu_has_hw_af() was added earlier via commit 47d7b15b88f9
> ("arm64: cpufeature: introduce helper cpu_has_hw_af()"), but above commit
> did add this on regular page fault path via do_set_pte().

Indeed.  The only other user of this function is wp_page_copy, and,
from what I can tell, the call sits in an unlikely() branch taken when
COW is being performed on a page that was recently unmapped.  It is not
frequent enough to show up in my profiling.

>>  						ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
>>  }
>
> LGTM but as mentioned earlier, are there not other similar instances or this
> is just more problematic being on direct page fault path ?

I think a full audit of the emulated system registers in kvm will be
required to answer that definitively.  But this instance is, by far, the
hottest case I have found.
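
To illustrate why one read on the fault path can dominate the exit
count, here is a minimal userspace sketch.  It is not the patch: every
sim_* name is hypothetical, the trapped read is simulated with a
counter, and the field layout (HAFDBS in bits [3:0], non-zero meaning
hardware AF is implemented) is my reading of the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: HAFDBS occupies bits [3:0] of ID_AA64MMFR1_EL1. */
#define SIM_HAFDBS_SHIFT	0
#define SIM_HAFDBS_MASK		0xfULL

/* Counts simulated traps to the hypervisor (hypothetical stand-in). */
static unsigned long sim_trap_count;

/* Stand-in for an emulated ID register read: each call costs one "exit". */
static uint64_t sim_read_mmfr1(void)
{
	sim_trap_count++;
	return 0x2;	/* arbitrary example value with HAFDBS = 2 */
}

static bool sim_field_nonzero(uint64_t reg)
{
	return ((reg >> SIM_HAFDBS_SHIFT) & SIM_HAFDBS_MASK) != 0;
}

/* Reads the register on every call, as a hot path would without caching. */
static bool sim_hw_af_uncached(void)
{
	return sim_field_nonzero(sim_read_mmfr1());
}

/* Reads the register once and reuses the saved value afterwards. */
static bool sim_hw_af_cached(void)
{
	static uint64_t cached;
	static bool valid;

	if (!valid) {
		cached = sim_read_mmfr1();
		valid = true;
	}
	return sim_field_nonzero(cached);
}

/* Runs nfaults simulated faults and returns how many traps they caused. */
static unsigned long sim_fault_loop(bool (*check)(void), unsigned long nfaults)
{
	unsigned long before = sim_trap_count;
	unsigned long i;

	for (i = 0; i < nfaults; i++)
		(void)check();

	return sim_trap_count - before;
}
```

With the uncached check the number of simulated traps grows linearly
with the number of faults, while the cached check traps only on its
first call.  The static-variable cache is just the simplest way to show
the effect in userspace; it is not thread-safe and is not meant to
suggest how the kernel should store the value.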

-- 
Gabriel Krisman Bertazi


