[PATCH v2 0/4] VFIO platform reset

Scott Wood scottwood at freescale.com
Fri Jun 5 14:14:52 PDT 2015


On Fri, 2015-06-05 at 13:05 -0500, Rob Herring wrote:
> On Fri, Jun 5, 2015 at 10:06 AM, Eric Auger <eric.auger at linaro.org> wrote:
> > In situations where the userspace driver is stopped abnormally and the
> > VFIO platform device is released, the assigned HW device currently is
> > left running. As a consequence the HW device might continue issuing IRQs
> > and performing DMA accesses.
> > 
> > On release, no physical IRQ handler is setup anymore. Also the DMA buffers
> > are unmapped leading to IOMMU aborts. So there is no serious consequence.
> > 
> > However when assigning that HW device again to another userspace driver,
> > this latter might face some unexpected IRQs and DMA accesses, which are
> > the result of the previous assignment.
> 
> In general, shouldn't it just be a requirement that the drivers handle
> this condition. You have the same problem with firmware/bootloaders
> leaving h/w not in reset state or kexec'ing to a new kernel.

It's not the same situation.  Firmware may leave HW in a non-reset 
state but it must not leave the HW doing DMA; there's nothing the OS 
could do about that as the OS could get corrupted before the driver 
has a chance to run (this is not fun to debug).  Leaving interrupts 
potentially asserted would be bad as well, especially if the interrupt 
is shared.

Likewise, with normal kexec, drivers are supposed to quiesce the
hardware first -- and with kdump, the affected DMA buffers are never
reused.

In order for the driver to handle this, it would need to reset/quiesce 
the device itself before enabling an IOMMU mapping.  How would that 
work for virtualization scenarios where the guest does not see any 
IOMMU, and all vfio mappings are handled by QEMU or equivalent?
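
To make the ordering problem concrete, here is a toy model in plain
Python.  It is only a sketch and has nothing to do with the actual vfio
code: ToyIommu, ToyDevice and run_scenario are made-up names, and the
"device" is reduced to a single DMA target address.  It assumes the
second user gets the same IOVA mapped before any driver for it has run,
which is roughly the QEMU case above.

```python
# Toy model only: ToyIommu, ToyDevice and run_scenario are hypothetical
# names for illustration, not vfio or kernel interfaces.


class ToyIommu:
    def __init__(self):
        self.mapped = {}   # iova -> owner of the buffer behind it
        self.aborts = 0    # DMA attempts that hit no mapping

    def map(self, iova, owner):
        self.mapped[iova] = owner

    def unmap_all(self):
        self.mapped.clear()


class ToyDevice:
    def __init__(self, iommu):
        self.iommu = iommu
        self.dma_target = None   # IOVA the device keeps writing to

    def start_dma(self, iova):
        self.dma_target = iova

    def reset(self):
        self.dma_target = None   # quiesced: no more DMA

    def tick(self):
        """One DMA attempt; return the owner whose buffer was hit, or None."""
        if self.dma_target is None:
            return None
        owner = self.iommu.mapped.get(self.dma_target)
        if owner is None:
            self.iommu.aborts += 1   # unmapped: IOMMU abort, nothing written
        return owner


def run_scenario(reset_on_release):
    iommu = ToyIommu()
    dev = ToyDevice(iommu)

    # User A maps a buffer, starts DMA, then dies without stopping the device.
    iommu.map(0x1000, "A")
    dev.start_dma(0x1000)
    dev.tick()

    # Release: the mappings go away; optionally the device is reset as well.
    iommu.unmap_all()
    if reset_on_release:
        dev.reset()
    dev.tick()   # aborts at the IOMMU if the device is still running

    # User B gets the same IOVA mapped before any driver for B has run.
    iommu.map(0x1000, "B")
    stale_hit = dev.tick() == "B"
    return stale_hit, iommu.aborts
```

In this model, run_scenario(False) has the leftover DMA landing in B's
buffer after one harmless abort, while run_scenario(True) has neither.
The new driver never gets a window in which to quiesce the device
itself, because the mapping is already live by the time it runs.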

-Scott




More information about the linux-arm-kernel mailing list