[PATCH] arm64: kvm: Expose timer offset directly via KVM_{GET,SET}_ONE_REG

Simon Veith sveith at amazon.de
Thu Feb 2 04:13:14 PST 2023


The virtual timer count register (CNTVCT_EL0) is virtualized by
programming the offset register CNTVOFF_EL2, whose value is subtracted
from the raw hardware timer count when the guest reads the current
count.

Currently, we offer userspace the ability to serialize and deserialize
only the absolute count register value, using KVM_{GET,SET}_ONE_REG with
KVM_REG_ARM_TIMER_CNT. Internally, we then compute and set the offset
register accordingly to obtain the requested count value.
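
The relationship between count and offset can be written out as a small
standalone model. This is only a sketch of the arithmetic the existing
code performs (count = physical - offset, offset = physical - count); the
function names guest_count() and offset_for_count() are illustrative and
do not exist in the kernel.

```c
#include <stdint.h>

/*
 * Illustrative model, not kernel code.
 * Count the guest reads: raw physical count minus the offset that is
 * programmed into CNTVOFF_EL2. Unsigned arithmetic wraps modulo 2^64.
 */
static uint64_t guest_count(uint64_t phys, uint64_t offset)
{
	return phys - offset;
}

/*
 * Offset derived when userspace sets an absolute count while the
 * physical counter reads phys_at_set.
 */
static uint64_t offset_for_count(uint64_t phys_at_set, uint64_t count)
{
	return phys_at_set - count;
}
```

For example, setting a count of 400 while the physical counter reads
1000 yields an offset of 600, and a later read at physical count 1500
returns 900 to the guest.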

Being able to set the timer count only by absolute value is a problem
for virtual machine monitors that try to maintain the illusion of
continuously ticking clocks for the guest: in workflows like live
migration or live update, the timers must be advanced artificially to
account for pause time.

Any delay between userspace computing the correct timer count value and
actually setting it in kernel space with KVM_SET_ONE_REG (for example,
due to scheduling) becomes visible as under-accounted pause time in the
guest: the guest observes that its system clock has fallen behind its
NTP time reference.

The issue is further complicated when vCPU setup is performed by
independent threads which may experience different delays, leading to
jitter between the clocks of different vCPUs.

We could deliver a more stable timer in such scenarios if we allowed
userspace to set the offset relative to the physical counter directly.
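
To make the effect of the delay concrete, the following standalone
sketch compares the two ways of restoring the timer. It is a simplified
model under stated assumptions: all values are in counter ticks, "delay"
stands for the time between userspace computing the value and the kernel
applying it, and both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Illustrative model, not kernel code.
 * Guest count at physical time `now` when userspace computed an absolute
 * `count` at physical time `computed_at`, but the kernel applied it
 * `delay` ticks later. The kernel derives the offset from the physical
 * count at the time of the write, so the delay leaks into the offset.
 */
static uint64_t count_after_abs_set(uint64_t computed_at, uint64_t delay,
				    uint64_t count, uint64_t now)
{
	uint64_t offset = (computed_at + delay) - count;

	return now - offset;
}

/*
 * Same scenario, but userspace computes the offset itself and passes it
 * through unchanged. When the write lands no longer matters.
 */
static uint64_t count_after_off_set(uint64_t computed_at, uint64_t count,
				    uint64_t now)
{
	uint64_t offset = computed_at - count;

	return now - offset;
}
```

With computed_at = 1000, count = 400, delay = 50 and now = 2000, the
absolute-count path yields 1350 while the offset path yields 1400: the
guest is exactly `delay` ticks behind in the first case.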

Expose the KVM_REG_ARM_TIMER_OFF register directly to userspace, as an
alternative view of the timer count. KVM_GET_REG_LIST still reports only
the existing KVM_REG_ARM_TIMER_CNT register, as that register value is
portable across different VM hosts and thus safe to persist.
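
From userspace, writing the offset could look roughly like the sketch
below. It uses only the existing KVM_SET_ONE_REG ioctl and struct
kvm_one_reg from <linux/kvm.h>; the helper name vcpu_set_reg_u64() is
hypothetical, and the register ID is taken as a parameter because
KVM_REG_ARM_TIMER_OFF exists only in kernels carrying this patch.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: write one 64-bit register of a vCPU.
 * With this patch applied, reg_id would be KVM_REG_ARM_TIMER_OFF and
 * value the desired offset relative to the physical counter.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int vcpu_set_reg_u64(int vcpu_fd, uint64_t reg_id, uint64_t value)
{
	struct kvm_one_reg reg = {
		.id = reg_id,
		.addr = (uint64_t)(uintptr_t)&value,
	};

	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}
```

A VMM restoring several vCPUs from independent threads could pass the
same offset value to each of them, so that scheduling delays in any one
thread do not skew that vCPU's clock against the others.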

Signed-off-by: Simon Veith <sveith at amazon.de>

CC: dwmw2 at infradead.org
CC: Catalin Marinas <catalin.marinas at arm.com>
CC: Will Deacon <will at kernel.org>
CC: Marc Zyngier <maz at kernel.org>
CC: James Morse <james.morse at arm.com>
CC: Suzuki K Poulose <suzuki.poulose at arm.com>
CC: Oliver Upton <oliver.upton at linux.dev>
CC: Zenghui Yu <yuzenghui at huawei.com>
CC: linux-arm-kernel at lists.infradead.org
---
 arch/arm64/include/uapi/asm/kvm.h |  1 +
 arch/arm64/kvm/arch_timer.c       | 10 ++++++++++
 arch/arm64/kvm/guest.c            |  9 +++++++++
 include/kvm/arm_arch_timer.h      |  1 +
 4 files changed, 21 insertions(+)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index a7a857f1784d..077699e403ab 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -260,6 +260,7 @@ struct kvm_arm_copy_mte_tags {
 #define KVM_REG_ARM_TIMER_CTL		ARM64_SYS_REG(3, 3, 14, 3, 1)
 #define KVM_REG_ARM_TIMER_CVAL		ARM64_SYS_REG(3, 3, 14, 0, 2)
 #define KVM_REG_ARM_TIMER_CNT		ARM64_SYS_REG(3, 3, 14, 3, 2)
+#define KVM_REG_ARM_TIMER_OFF		ARM64_SYS_REG(3, 4, 14, 0, 3)
 
 /* KVM-as-firmware specific pseudo-registers */
 #define KVM_REG_ARM_FW			(0x0014 << KVM_REG_ARM_COPROC_SHIFT)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index bb24a76b4224..f68b9edbea6b 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -830,6 +830,9 @@ int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value)
 		timer = vcpu_vtimer(vcpu);
 		update_vtimer_cntvoff(vcpu, kvm_phys_timer_read() - value);
 		break;
+	case KVM_REG_ARM_TIMER_OFF:
+		update_vtimer_cntvoff(vcpu, value);
+		break;
 	case KVM_REG_ARM_TIMER_CVAL:
 		timer = vcpu_vtimer(vcpu);
 		kvm_arm_timer_write(vcpu, timer, TIMER_REG_CVAL, value);
@@ -875,6 +878,9 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
 	case KVM_REG_ARM_TIMER_CNT:
 		return kvm_arm_timer_read(vcpu,
 					  vcpu_vtimer(vcpu), TIMER_REG_CNT);
+	case KVM_REG_ARM_TIMER_OFF:
+		return kvm_arm_timer_read(vcpu,
+					  vcpu_vtimer(vcpu), TIMER_REG_OFF);
 	case KVM_REG_ARM_TIMER_CVAL:
 		return kvm_arm_timer_read(vcpu,
 					  vcpu_vtimer(vcpu), TIMER_REG_CVAL);
@@ -915,6 +921,10 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
 		val = kvm_phys_timer_read() - timer_get_offset(timer);
 		break;
 
+	case TIMER_REG_OFF:
+		val = timer_get_offset(timer);
+		break;
+
 	default:
 		BUG();
 	}
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index cf4c495a4321..a934189e1811 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -586,6 +586,10 @@ static unsigned long num_core_regs(const struct kvm_vcpu *vcpu)
 
 /**
  * ARM64 versions of the TIMER registers, always available on arm64
+ *
+ * Note that the _OFF register is another view of the _CNT register, and is
+ * therefore not counted separately in NUM_TIMER_REGS, nor included in
+ * KVM_GET_REG_LIST.
  */
 
 #define NUM_TIMER_REGS 3
@@ -595,6 +599,7 @@ static bool is_timer_reg(u64 index)
 	switch (index) {
 	case KVM_REG_ARM_TIMER_CTL:
 	case KVM_REG_ARM_TIMER_CNT:
+	case KVM_REG_ARM_TIMER_OFF:
 	case KVM_REG_ARM_TIMER_CVAL:
 		return true;
 	}
@@ -609,6 +614,10 @@ static int copy_timer_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 	if (put_user(KVM_REG_ARM_TIMER_CNT, uindices))
 		return -EFAULT;
 	uindices++;
+	/*
+	 * KVM_REG_ARM_TIMER_OFF is another view of KVM_REG_ARM_TIMER_CNT and
+	 * therefore not included in the register list.
+	 */
 	if (put_user(KVM_REG_ARM_TIMER_CVAL, uindices))
 		return -EFAULT;
 
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index cd6d8f260eab..66de7aa2018e 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -21,6 +21,7 @@ enum kvm_arch_timer_regs {
 	TIMER_REG_CVAL,
 	TIMER_REG_TVAL,
 	TIMER_REG_CTL,
+	TIMER_REG_OFF,
 };
 
 struct arch_timer_context {
-- 
2.34.1









