[RFC PATCH] arm64: Cache HW update of Access Flag support

Gabriel Krisman Bertazi krisman at suse.de
Fri Jan 6 08:38:25 PST 2023


Accessing ID_AA64MMFR1_EL1 is expensive in KVM guests, since it is
emulated in the hypervisor.  In fact, the Arm documentation notes that
some feature registers are not expected to be accessed frequently by the
OS, and should therefore be emulated for guests [1].

Commit 0388f9c74330 ("arm64: mm: Implement
arch_wants_old_prefaulted_pte()") introduced a read of this register in
the page fault path.  Even when the feature of mapping faultaround
pages with the old flag is disabled for a given CPU, we still pay the
cost of reading the register on every page fault.  This results in an
explosion of vmexit events in KVM guests, which directly impacts the
performance of virtualized workloads.  For instance, running kernbench
yields a 15% increase in system time solely due to the extra vmexit
cycles.

This patch avoids the extra cost by caching the register value after the
first read.  This should be safe, since the value of this register
cannot change for a given processor.

As a side note, we can't rely on the usual cpucap for this bit because,
for a reason I'm missing, it is always set on arm64 even when the
functionality is not supported by the processor.

[1] https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/Learn%20the%20Architecture/Armv8-A%20virtualization.pdf?revision=a765a7df-1a00-434d-b241-357bfda2dd31

Signed-off-by: Gabriel Krisman Bertazi <krisman at suse.de>
---
 arch/arm64/include/asm/cpufeature.h | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 03d1c9d7af82..599c7a22155b 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -859,14 +859,24 @@ static inline u32 id_aa64mmfr0_parange_to_phys_shift(int parange)
 /* Check whether hardware update of the Access flag is supported */
 static inline bool cpu_has_hw_af(void)
 {
-	u64 mmfr1;
+	static unsigned int has_af = -1U;
 
 	if (!IS_ENABLED(CONFIG_ARM64_HW_AFDBM))
 		return false;
 
-	mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);
-	return cpuid_feature_extract_unsigned_field(mmfr1,
+	/*
+	 * Cache the register to avoid repeatedly reading it,
+	 * which is an emulated operation on KVM guests.
+	 *
+	 * Also do not rely on cpucap, since this bit is always set
+	 * there, regardless of actual status. See has_hw_dbm() for
+	 * details.
+	 */
+	if (unlikely(has_af == -1U))
+		has_af = cpuid_feature_extract_unsigned_field(
+						read_cpuid(ID_AA64MMFR1_EL1),
 						ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
+	return has_af;
 }
 
 static inline bool cpu_has_pan(void)
-- 
2.35.3
