[RFC PATCH] KVM: arm64: vgic: Skip vCPU trylock for pre-init register access
Yao Yuan
yaoyuan at linux.alibaba.com
Mon May 11 04:03:25 PDT 2026
On Sun, May 10, 2026 at 11:11:43PM +0800, David Woodhouse wrote:
> From: David Woodhouse <dwmw at amazon.co.uk>
>
> When creating a VM, userspace sets pre-init distributor registers such
> as GICD_IIDR before the VGIC is initialized.
>
> Strictly speaking there's no reason this couldn't be done at VM creation
> time before any vCPUs exist at all, but the design of the userspace API
> does require vCPU0 to have been created, as the system-wide registers
> are effectively accessed via vCPU0.
>
> So a VMM can't just set the IIDR at startup before spawning vCPU threads;
> it has to be done once the vCPUs are being created.
>
> However, kvm_trylock_all_vcpus() makes it unreliable by causing spurious
> -EBUSY failures if any vCPU cannot be locked. So userspace is forced to
> create the vCPUs (well, at least vCPU0), and is then forced to have a
> full synchronization point and quiesce them all in order to reliably set
> the IIDR.
>
> To slightly reduce the pain of all this, skip the trylock when the VGIC
> is not yet initialized. There is no need to lock the vCPUs if they can't
> be accessing it anyway.
>
> Signed-off-by: David Woodhouse <dwmw at amazon.co.uk>
> ---
> Other options would include making it possible to set the IIDR before
> creating any vCPUs, either by creating a new device-level attribute or
> special-casing it not to require vCPU0 for DIST_REGS that aren't really
> associated to a vCPU.
>
> Deprecating kvm_trylock_all_vcpus() and killing every user of it with
> fire would also be a good option, perhaps... :)
>
> arch/arm64/kvm/vgic/vgic-kvm-device.c | 38 ++++++++++++++++++++-------
> 1 file changed, 29 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-kvm-device.c b/arch/arm64/kvm/vgic/vgic-kvm-device.c
> index a96c77dccf35..e17ea9f07f5f 100644
> --- a/arch/arm64/kvm/vgic/vgic-kvm-device.c
> +++ b/arch/arm64/kvm/vgic/vgic-kvm-device.c
> @@ -540,7 +540,7 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev,
> struct vgic_reg_attr reg_attr;
> gpa_t addr;
> struct kvm_vcpu *vcpu;
> - bool uaccess;
> + bool uaccess, vcpus_locked = false;
> u32 val;
> int ret;
>
> @@ -566,18 +566,37 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev,
> return -EFAULT;
> }
>
> + if (!vgic_initialized(dev->kvm) && !reg_allowed_pre_init(attr))
> + return -EBUSY;
> +
> mutex_lock(&dev->kvm->lock);
>
> - if (kvm_trylock_all_vcpus(dev->kvm)) {
> - mutex_unlock(&dev->kvm->lock);
> - return -EBUSY;
> + /*
> + * Pre-init registers (e.g. GICD_IIDR) don't need vCPU quiescence
> + * since the VGIC isn't live yet. Skip the trylock to avoid spurious
> + * -EBUSY when vCPU threads happen to be running.
> + */
> + if (vgic_initialized(dev->kvm)) {
> + if (kvm_trylock_all_vcpus(dev->kvm)) {
> + mutex_unlock(&dev->kvm->lock);
> + return -EBUSY;
> + }
> + vcpus_locked = true;
> }
> -
> mutex_lock(&dev->kvm->arch.config_lock);
>
> - if (!(vgic_initialized(dev->kvm) || reg_allowed_pre_init(attr))) {
> - ret = -EBUSY;
> - goto out;
Hi David,
> + /*
> + * If the VGIC becomes initialized between the above check and taking
> + * config_lock, drop config_lock to lock the VCPUS.
> + */
> + if (vgic_initialized(dev->kvm) && !vcpus_locked) {
> + mutex_unlock(&dev->kvm->arch.config_lock);
> + if (kvm_trylock_all_vcpus(dev->kvm)) {
> + mutex_unlock(&dev->kvm->lock);
> + return -EBUSY;
> + }
> + mutex_lock(&dev->kvm->arch.config_lock);
> + vcpus_locked = true;
Question: Why not do vgic_initialized(dev->kvm) and
reg_allowed_pre_init(attr) checking after take config_lock,
and depends on the checking result to decide trylocak all
vcpus or not ?
> }
>
> switch (attr->group) {
> @@ -612,7 +631,8 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev,
>
> out:
> mutex_unlock(&dev->kvm->arch.config_lock);
> - kvm_unlock_all_vcpus(dev->kvm);
> + if (vcpus_locked)
> + kvm_unlock_all_vcpus(dev->kvm);
> mutex_unlock(&dev->kvm->lock);
>
> if (!ret && uaccess && !is_write) {
> --
> 2.43.0
>
>
More information about the linux-arm-kernel
mailing list