[PATCH v8 0/7] KVM: x86: Add idempotent controls for migrating system counter state
Paolo Bonzini
pbonzini at redhat.com
Fri Sep 24 09:43:18 PDT 2021
On 16/09/21 20:15, Oliver Upton wrote:
> KVM's current means of saving/restoring system counters is plagued with
> temporal issues. On x86, we migrate the guest's system counter by-value
> through the respective guest's IA32_TSC value. Restoring system counters
> by-value is brittle as the state is not idempotent: the host system
> counter is still advancing between the attempted save and restore.
> Furthermore, VMMs may wish to transparently live migrate guest VMs,
> meaning that they include the elapsed time due to live migration blackout
> in the guest system counter view. The VMM thread could be preempted for
> any number of reasons (scheduler, L0 hypervisor under nested) between the
> time that it calculates the desired guest counter value and when
> KVM actually sets this counter state.
>
> Despite the value-based interface that we present to userspace, KVM
> actually has idempotent guest controls by way of the TSC offset.
> We can avoid all of the issues associated with a value-based interface
> by abstracting these offset controls in a new device attribute. This
> series introduces new vCPU device attributes to provide userspace access
> to the vCPU's system counter offset.
>
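To make the value-versus-offset distinction concrete, here is a minimal sketch in plain C. It assumes no TSC scaling, so the guest simply sees host TSC plus offset. The helper names guest_tsc() and offset_from_value() are hypothetical and are not taken from the series.

```c
#include <stdint.h>

/* Guest view of the counter, assuming no TSC scaling: guest = host + offset. */
static uint64_t guest_tsc(uint64_t host_tsc, int64_t offset)
{
	return host_tsc + (uint64_t)offset;
}

/*
 * Value-based restore: the effective offset depends on the host TSC at the
 * moment the write actually lands, so any delay between computing the
 * desired value and applying it is silently lost from the guest's view.
 */
static int64_t offset_from_value(uint64_t desired_guest_tsc,
				 uint64_t host_tsc_at_write)
{
	return (int64_t)(desired_guest_tsc - host_tsc_at_write);
}
```

For example, a VMM that computes a desired guest value of 1500 at host TSC 1000 intends an offset of 500. If the thread is preempted and the write lands at host TSC 1200, the effective offset becomes 300 and the guest counter is 200 ticks behind. Writing the offset 500 directly gives the same result no matter when the write lands.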
> Patches 1-2 are Paolo's refactorings around locking and the
> KVM_{GET,SET}_CLOCK ioctls.
>
> Patch 3 cures a race where use_master_clock is read outside of the
> pvclock lock in the KVM_GET_CLOCK ioctl.
>
> Patch 4 adopts Paolo's suggestion, augmenting the KVM_{GET,SET}_CLOCK
> ioctls to provide userspace with a (host_tsc, realtime) instant. This is
> essential for a VMM to perform precise migration of the guest's system
> counters.
>
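As a rough illustration of why a (host_tsc, realtime) pair matters, the sketch below shows one way a VMM might derive a destination offset from such pairs. It is a simplification under stated assumptions, not the procedure documented by the series: struct clock_sample, ns_to_ticks() and dest_tsc_offset() are hypothetical names, source and destination realtime clocks are assumed to be synchronized, the destination sample is assumed to be no earlier than the source sample, and the guest TSC is assumed to run unscaled at tsc_khz on both hosts.

```c
#include <stdint.h>

/* One (host_tsc, realtime) instant; a hypothetical container for this sketch. */
struct clock_sample {
	uint64_t host_tsc;     /* host TSC reading at the instant */
	uint64_t realtime_ns;  /* host wall-clock time in ns at the same instant */
};

/* Convert nanoseconds to TSC ticks, split to avoid 64-bit overflow. */
static uint64_t ns_to_ticks(uint64_t ns, uint64_t tsc_khz)
{
	return (ns / 1000000) * tsc_khz + (ns % 1000000) * tsc_khz / 1000000;
}

/*
 * Pick the destination offset so the guest counter appears to have kept
 * running for the wall-clock time between the two samples.
 */
static int64_t dest_tsc_offset(struct clock_sample src, int64_t src_offset,
			       struct clock_sample dst, uint64_t tsc_khz)
{
	uint64_t guest_at_src = src.host_tsc + (uint64_t)src_offset;
	uint64_t elapsed = ns_to_ticks(dst.realtime_ns - src.realtime_ns,
				       tsc_khz);

	return (int64_t)(guest_at_src + elapsed - dst.host_tsc);
}
```

Because the result is an offset rather than a value, the VMM can be delayed arbitrarily between computing it and applying it without skewing the guest counter.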
> Patch 5 does away with the pvclock spin lock in favor of a sequence
> lock based on the tsc_write_lock. The original patch is from Paolo; I
> touched it up a bit to fix a deadlock and some unused variables that
> caused -Werror to scream.
>
> Patch 6 extracts the TSC synchronization tracking code in a way that it
> can be used for both offset-based and value-based TSC synchronization
> schemes.
>
> Finally, patch 7 implements a vCPU device attribute which allows VMMs to
> get at the TSC offset of a vCPU.
>
> This series was tested with the new KVM selftests for the KVM clock and
> system counter offset controls on Haswell hardware. Kernel was built
> with CONFIG_LOCKDEP given the new locking changes/lockdep assertions
> here.
>
> Note that these tests are mailed as a separate series due to the
> dependencies in both x86 and arm64.
>
> Applies cleanly to 5.15-rc1
>
> v7: http://lore.kernel.org/r/20210816001130.3059564-1-oupton@google.com
>
> v7 -> v8:
> - Rebased to 5.15-rc1
> - Picked up Paolo's version of the series, which includes locking
> changes
> - Make KVM advertise KVM_CAP_VCPU_ATTRIBUTES
>
> Oliver Upton (4):
> KVM: x86: Fix potential race in KVM_GET_CLOCK
> KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK
> KVM: x86: Refactor tsc synchronization code
> KVM: x86: Expose TSC offset controls to userspace
>
> Paolo Bonzini (3):
> kvm: x86: abstract locking around pvclock_update_vm_gtod_copy
> KVM: x86: extract KVM_GET_CLOCK/KVM_SET_CLOCK to separate functions
> kvm: x86: protect masterclock with a seqcount
>
> Documentation/virt/kvm/api.rst | 42 ++-
> Documentation/virt/kvm/devices/vcpu.rst | 57 +++
> arch/x86/include/asm/kvm_host.h | 12 +-
> arch/x86/include/uapi/asm/kvm.h | 4 +
> arch/x86/kvm/x86.c | 458 ++++++++++++++++--------
> include/uapi/linux/kvm.h | 7 +-
> 6 files changed, 419 insertions(+), 161 deletions(-)
>
Queued, thanks.
Paolo