[PATCH v8 0/7] KVM: x86: Add idempotent controls for migrating system counter state

Paolo Bonzini pbonzini at redhat.com
Fri Sep 24 09:43:18 PDT 2021


On 16/09/21 20:15, Oliver Upton wrote:
> KVM's current means of saving/restoring system counters is plagued with
> temporal issues. On x86, we migrate the guest's system counter by-value
> through the respective guest's IA32_TSC value. Restoring system counters
> by-value is brittle as the state is not idempotent: the host system
> counter is still oscillating between the attempted save and restore.
> Furthermore, VMMs may wish to transparently live migrate guest VMs,
> meaning that they include the elapsed time due to live migration blackout
> in the guest system counter view. The VMM thread could be preempted for
> any number of reasons (scheduler, L0 hypervisor under nested) between the
> time that it calculates the desired guest counter value and when
> KVM actually sets this counter state.
> 
> Despite the value-based interface that we present to userspace, KVM
> actually has idempotent guest controls by way of the TSC offset.
> We can avoid all of the issues associated with a value-based interface
> by abstracting these offset controls in a new device attribute. This
> series introduces new vCPU device attributes to provide userspace access
> to the vCPU's system counter offset.
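
To make the value-versus-offset distinction above concrete, here is a minimal sketch, assuming the guest counter is simply the host counter plus a per-vCPU offset (TSC scaling is ignored). The helper names guest_tsc() and offset_for_value() are hypothetical and are not part of KVM.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Guest view of the counter: host counter plus a per-vCPU offset.
 * Arithmetic is unsigned and wraps modulo 2^64, so a "negative"
 * offset is just a large uint64_t. Assumption: no TSC scaling.
 */
uint64_t guest_tsc(uint64_t host_tsc, uint64_t offset)
{
	return host_tsc + offset;
}

/*
 * Value-based restore: the offset needed so that the guest reads
 * `value` at the moment the host counter reads `host_tsc_at_write`.
 * The result depends on when the write lands, which is why restoring
 * by value is not idempotent.
 */
uint64_t offset_for_value(uint64_t value, uint64_t host_tsc_at_write)
{
	return value - host_tsc_at_write;
}
```

If the same value is written 300 host cycles later than intended, the resulting offset is 300 cycles smaller and the guest silently loses that time. Writing the same offset twice always yields the same guest view.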
> 
> Patches 1-2 are Paolo's refactorings around locking and the
> KVM_{GET,SET}_CLOCK ioctls.
> 
> Patch 3 cures a race where use_master_clock is read outside of the
> pvclock lock in the KVM_GET_CLOCK ioctl.
> 
> Patch 4 adopts Paolo's suggestion, augmenting the KVM_{GET,SET}_CLOCK
> ioctls to provide userspace with a (host_tsc, realtime) instant. This is
> essential for a VMM to perform precise migration of the guest's system
> counters.
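
As a rough illustration of why a (host_tsc, realtime) instant helps, the sketch below computes a destination offset from one instant sampled on each host. It assumes both hosts have synchronized CLOCK_REALTIME, that the destination sample is the later one, and that the guest TSC runs at the same tsc_khz on both sides. The struct clock_instant type and both function names are hypothetical, not KVM uAPI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one sample; not a KVM uAPI struct. */
struct clock_instant {
	uint64_t host_tsc;    /* host counter reading, in cycles */
	uint64_t realtime_ns; /* host CLOCK_REALTIME at that instant, in ns */
};

/* Convert nanoseconds to cycles at tsc_khz: cycles = ns * khz / 10^6. */
uint64_t ns_to_cycles(uint64_t ns, uint64_t tsc_khz)
{
	assert(tsc_khz > 0);
	/* Split into whole milliseconds and remainder to limit overflow. */
	return (ns / 1000000) * tsc_khz + (ns % 1000000) * tsc_khz / 1000000;
}

/*
 * Offset to program on the destination so the guest counter keeps
 * advancing across the migration blackout. Assumes dst.realtime_ns is
 * not earlier than src.realtime_ns.
 */
uint64_t dest_offset(uint64_t src_offset, struct clock_instant src,
		     struct clock_instant dst, uint64_t tsc_khz)
{
	uint64_t guest_at_src = src.host_tsc + src_offset;
	uint64_t elapsed_ns = dst.realtime_ns - src.realtime_ns;
	uint64_t guest_at_dst = guest_at_src + ns_to_cycles(elapsed_ns, tsc_khz);

	return guest_at_dst - dst.host_tsc;
}
```

Because the result is an offset rather than a counter value, it does not matter how long the VMM thread is delayed between computing it and handing it to KVM.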
> 
> Patch 5 does away with the pvclock spin lock in favor of a sequence
> lock based on the tsc_write_lock. The original patch is from Paolo; I
> touched it up a bit to fix a deadlock and some unused variables that
> caused -Werror to scream.
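
For readers unfamiliar with sequence locks, the following is a userspace analogue written with C11 atomics. It is only a sketch of the general technique and not the kernel's seqcount API. The struct and function names are hypothetical, and the writer is assumed to be serialized by some outer lock (the role tsc_write_lock plays in the description above).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the clock data being protected. */
struct clock_snapshot {
	uint64_t base_tsc;
	uint64_t base_ns;
};

struct seq_clock {
	atomic_uint seq;           /* even: stable, odd: write in progress */
	_Atomic uint64_t base_tsc;
	_Atomic uint64_t base_ns;
};

/* Writer side. Callers must serialize writers with their own lock. */
void seq_clock_write(struct seq_clock *c, uint64_t tsc, uint64_t ns)
{
	atomic_fetch_add(&c->seq, 1);    /* sequence becomes odd */
	atomic_store(&c->base_tsc, tsc);
	atomic_store(&c->base_ns, ns);
	atomic_fetch_add(&c->seq, 1);    /* sequence becomes even again */
}

/* Reader side: takes no lock, retries if a write overlapped the read. */
struct clock_snapshot seq_clock_read(struct seq_clock *c)
{
	struct clock_snapshot snap;
	unsigned int start;

	do {
		start = atomic_load(&c->seq);
		snap.base_tsc = atomic_load(&c->base_tsc);
		snap.base_ns = atomic_load(&c->base_ns);
	} while ((start & 1) || start != atomic_load(&c->seq));

	return snap;
}
```

Readers never block the writer, which is the usual reason to prefer this scheme over a spin lock for data that is read often and written rarely.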
> 
> Patch 6 extracts the TSC synchronization tracking code in a way that it
> can be used for both offset-based and value-based TSC synchronization
> schemes.
> 
> Finally, patch 7 implements a vCPU device attribute which allows VMMs to
> get at the TSC offset of a vCPU.
> 
> This series was tested with the new KVM selftests for the KVM clock and
> system counter offset controls on Haswell hardware. Kernel was built
> with CONFIG_LOCKDEP given the new locking changes/lockdep assertions
> here.
> 
> Note that these tests are mailed as a separate series due to the
> dependencies in both x86 and arm64.
> 
> Applies cleanly to 5.15-rc1
> 
> v7: http://lore.kernel.org/r/20210816001130.3059564-1-oupton@google.com
> 
> v7 -> v8:
>   - Rebased to 5.15-rc1
>   - Picked up Paolo's version of the series, which includes locking
>     changes
>   - Make KVM advertise KVM_CAP_VCPU_ATTRIBUTES
> 
> Oliver Upton (4):
>    KVM: x86: Fix potential race in KVM_GET_CLOCK
>    KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK
>    KVM: x86: Refactor tsc synchronization code
>    KVM: x86: Expose TSC offset controls to userspace
> 
> Paolo Bonzini (3):
>    kvm: x86: abstract locking around pvclock_update_vm_gtod_copy
>    KVM: x86: extract KVM_GET_CLOCK/KVM_SET_CLOCK to separate functions
>    kvm: x86: protect masterclock with a seqcount
> 
>   Documentation/virt/kvm/api.rst          |  42 ++-
>   Documentation/virt/kvm/devices/vcpu.rst |  57 +++
>   arch/x86/include/asm/kvm_host.h         |  12 +-
>   arch/x86/include/uapi/asm/kvm.h         |   4 +
>   arch/x86/kvm/x86.c                      | 458 ++++++++++++++++--------
>   include/uapi/linux/kvm.h                |   7 +-
>   6 files changed, 419 insertions(+), 161 deletions(-)
> 

Queued, thanks.

Paolo