[PATCH] kvmtool: don't use PCI config space IRQ line field
Will Deacon
will.deacon at arm.com
Tue Jun 16 10:06:57 PDT 2015
On Mon, Jun 15, 2015 at 11:45:38AM +0100, Andre Przywara wrote:
> On 06/05/2015 05:41 PM, Will Deacon wrote:
> > On Thu, Jun 04, 2015 at 04:20:45PM +0100, Andre Przywara wrote:
> >> In PCI config space there is an interrupt line field (offset 0x3f),
> >> which is used to initially communicate the IRQ line number from
> >> firmware to the OS. _Hardware_ should never use this information,
> >> as the OS is free to write any information in there.
> >> But kvmtool uses this number when it triggers IRQs in the guest,
> >> which fails starting with Linux 3.19-rc1, where the PCI layer starts
> >> writing the virtual IRQ number in there.
> >>
> >> Fix that by storing the IRQ number in a separate field in
> >> struct virtio_pci, which is independent from the PCI config space
> >> and cannot be influenced by the guest.
> >> This fixes ARM/ARM64 guests using PCI with newer kernels.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara at arm.com>
> >> ---
> >> include/kvm/virtio-pci.h | 8 ++++++++
> >> virtio/pci.c | 9 ++++++---
> >> 2 files changed, 14 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
> >> index c795ce7..b70cadd 100644
> >> --- a/include/kvm/virtio-pci.h
> >> +++ b/include/kvm/virtio-pci.h
> >> @@ -30,6 +30,14 @@ struct virtio_pci {
> >> u8 isr;
> >> u32 features;
> >>
> >> + /*
> >> + * We cannot rely on the INTERRUPT_LINE byte in the config space once
> >> + * we have run guest code, as the OS is allowed to use that field
> >> + * as a scratch pad to communicate between driver and PCI layer.
> >> + * So store our legacy interrupt line number in here for internal use.
> >> + */
> >> + u8 legacy_irq_line;
> >> +
> >> /* MSI-X */
> >> u16 config_vector;
> >> u32 config_gsi;
> >> diff --git a/virtio/pci.c b/virtio/pci.c
> >> index 7556239..e17e5a9 100644
> >> --- a/virtio/pci.c
> >> +++ b/virtio/pci.c
> >> @@ -141,7 +141,7 @@ static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 p
> >> break;
> >> case VIRTIO_PCI_ISR:
> >> ioport__write8(data, vpci->isr);
> >> - kvm__irq_line(kvm, vpci->pci_hdr.irq_line, VIRTIO_IRQ_LOW);
> >> + kvm__irq_line(kvm, vpci->legacy_irq_line, VIRTIO_IRQ_LOW);
> >> vpci->isr = VIRTIO_IRQ_LOW;
> >> break;
> >> default:
> >> @@ -299,7 +299,7 @@ int virtio_pci__signal_vq(struct kvm *kvm, struct virtio_device *vdev, u32 vq)
> >> kvm__irq_trigger(kvm, vpci->gsis[vq]);
> >> } else {
> >> vpci->isr = VIRTIO_IRQ_HIGH;
> >> - kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
> >> + kvm__irq_trigger(kvm, vpci->legacy_irq_line);
> >> }
> >> return 0;
> >> }
> >> @@ -323,7 +323,7 @@ int virtio_pci__signal_config(struct kvm *kvm, struct virtio_device *vdev)
> >> kvm__irq_trigger(kvm, vpci->config_gsi);
> >> } else {
> >> vpci->isr = VIRTIO_PCI_ISR_CONFIG;
> >> - kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
> >> + kvm__irq_trigger(kvm, vpci->legacy_irq_line);
> >> }
> >>
> >> return 0;
> >> @@ -422,6 +422,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
> >> if (r < 0)
> >> goto free_msix_mmio;
> >>
> >> + /* save the IRQ that device__register() has allocated */
> >> + vpci->legacy_irq_line = vpci->pci_hdr.irq_line;
> >
> > I'd rather we used the container_of trick that we do for virtio-mmio
> > devices when assigning the irq in device__register. Then we can avoid
> > this line completely.
>
> Not completely sure I get what you mean, I take it you want to assign
> legacy_irq_line in pci__assign_irq() directly (where the IRQ number is
> allocated).
> But this function is PCI generic code and is used by the VESA
> framebuffer and the shmem device on x86 as well. For those devices
> dev_hdr is not part of a struct virtio_pci, so we can't do container_of
> to assign the legacy_irq_line here directly.
> Admittedly this fix should apply to the other two users as well, but
> VESA does not use interrupts and pci-shmem is completely broken anyway,
> so I didn't bother to fix it in this regard.
> Would it be justified to provide an IRQ number field in struct
> device_header to address all users?
>
> Or what am I missing here?
If VESA and shmem are broken, they should either be fixed or removed.
If you fix them, then we could have separate virtual buses for virtio-pci
and emulated-pci (or whatever you want to call it). We could also have
a separate bus for passthrough-devices too.
However, that's quite a lot of work for a bug-fix, so I guess the easiest
thing is to extend your current hack to cover VESA and shmem too.
Will
More information about the linux-arm-kernel
mailing list