[bug report] KVM: arm64: vgic-v4: Occasionally issue VMOVP to an unmapped VPE on GICv4.1
Kunkun Jiang
jiangkunkun at huawei.com
Sun Sep 29 00:18:41 PDT 2024
Hi all,
I found a problem where VMOVP is occasionally issued to an unmapped VPE on
GICv4.1. In my test environment, operating on an unmapped VPE raises a RAS
error, which is how I noticed it. The detailed analysis is as follows.
vgic_v4_teardown() is executed when a VM is destroyed, to free the
GICv4 data structures. The code is as follows:
> /**
>  * vgic_v4_teardown - Free the GICv4 data structures
>  * @kvm: Pointer to the VM being destroyed
>  */
> void vgic_v4_teardown(struct kvm *kvm)
> {
>         struct its_vm *its_vm = &kvm->arch.vgic.its_vm;
>         int i;
>
>         lockdep_assert_held(&kvm->arch.config_lock);
>
>         if (!its_vm->vpes)
>                 return;
>
>         for (i = 0; i < its_vm->nr_vpes; i++) {
>                 struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, i);
>                 int irq = its_vm->vpes[i]->irq;
>
>                 irq_clear_status_flags(irq, DB_IRQ_FLAGS);
>                 free_irq(irq, vcpu);
>         }
>
>         its_free_vcpu_irqs(its_vm);
>         kfree(its_vm->vpes);
>         its_vm->nr_vpes = 0;
>         its_vm->vpes = NULL;
> }
[1] In irq_clear_status_flags(irq, DB_IRQ_FLAGS), the status flags of a
doorbell are cleared. DB_IRQ_FLAGS contains IRQ_NO_BALANCING, so after
this, irqbalance.service is allowed to move the doorbell.
[2] In free_irq(), the VPE is unmapped.
[3] In its_free_vcpu_irqs(its_vm), unregister_irq_proc() is called to
delete the contents in /proc/irq/xx/ of the doorbell.
For a large VM with many VPEs, there is a noticeable time window between
[2] and [3], which gives irqbalance.service a chance to move the doorbell.
As a result, VMOVP is issued to an unmapped VPE.
When I tried not clearing IRQ_NO_BALANCING, the problem went away. But
it's not clear to me whether doing so causes any other problem.
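To make the ordering concrete, here is a toy user-space model of steps
[1] to [3]. It is only a sketch, not kernel code: struct doorbell and
both helper functions are made up for illustration, and the model
assumes that an affinity change is possible only while the /proc entry
exists and IRQ_NO_BALANCING is not set.

```c
#include <stdbool.h>

/* Toy state for one doorbell interrupt; these are NOT kernel structures. */
struct doorbell {
	bool no_balancing;	/* models IRQ_NO_BALANCING being set */
	bool vpe_mapped;	/* models the VPE still being mapped */
	bool proc_visible;	/* models /proc/irq/xx/ still existing */
};

/*
 * Hypothetical helper: models an affinity change requested from user
 * space. Returns true if the change would be issued against a VPE that
 * is already unmapped.
 */
static bool affinity_write_hits_unmapped(const struct doorbell *db)
{
	/* No /proc entry, or balancing forbidden: nothing is issued. */
	if (!db->proc_visible || db->no_balancing)
		return false;

	/* The move goes ahead; it is a problem only if the VPE is unmapped. */
	return !db->vpe_mapped;
}

/*
 * Hypothetical helper: walks the three teardown steps for one doorbell
 * and probes the window between [2] and [3].
 */
bool teardown_has_window(bool clear_no_balancing)
{
	struct doorbell db = {
		.no_balancing = true,
		.vpe_mapped = true,
		.proc_visible = true,
	};
	bool hit;

	if (clear_no_balancing)
		db.no_balancing = false;	/* [1] flags cleared */

	db.vpe_mapped = false;			/* [2] VPE unmapped */

	hit = affinity_write_hits_unmapped(&db);	/* window between [2] and [3] */

	db.proc_visible = false;		/* [3] /proc entry removed */

	return hit;
}
```

In this model, teardown_has_window(true) reports the window and
teardown_has_window(false), which corresponds to not clearing
IRQ_NO_BALANCING, does not. The model says nothing about whether
keeping the flag has side effects elsewhere.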
Thanks,
Kunkun Jiang