[PATCHv3 1/3] ARM: mm: allow sub-architectures to override PCI I/O memory type
Arnd Bergmann
arnd at arndb.de
Wed May 21 01:20:57 PDT 2014
On Tuesday 20 May 2014 23:20:07 Jason Gunthorpe wrote:
> On Fri, May 16, 2014 at 10:53:33AM +0100, Will Deacon wrote:
>
> > > Correct. Assume a PCI device uses PIO and DMA. It sends a DMA to main memory
> > > and lets the CPU know about the data using a level (IntA as opposed to MSI)
> > > interrupt. The CPU performs an outl() operation to an I/O port to let the
> > > hardware know it has received the IRQ and the response of the outl() is
> > > guaranteed to flush the DMA transaction: by the time the outl() completes
> > > we know that the data in memory is valid because it is strongly ordered
> > > relative to the DMA.
>
> Keep in mind that the IntA message itself is going to flush the DMA,
> no sane host bridge implementation should process the IntA until all
> prior DMA writes are completed, just like MSI.

I was thinking of PCI, not PCIe here, where the interrupt can be directly
wired to the irqchip.

> Also, legacy non-MSI interrupts are always sharable, so the ISR must
> always start with a read of a device specific status register, which
> will also flush any DMA writes.

Right, good point.

> The simplest common scenario to show synchronous outl is this:
>
> void pci_isr()
> {
>         if (inl(status_reg) & INT_PENDING)
>                 outl(ACK_INT, status_reg);
> }
>
> Where the outl is not expected to complete at the CPU until the device
> has lowered the level triggered interrupt line.
>
> If outl is not synchronous then a spurious interrupt will be caused.
>
> When converting a driver to MMIO you'd often have to do this:
>
> void pci_isr()
> {
>         if (readl(status_reg) & INT_PENDING) {
>                 writel(ACK_INT, status_reg);
>                 readl(status_reg); // Synchronizing read, flushes write.
>         }
> }
>
> Which is one of the software-visible impacts of I/O vs MMIO.
>
> > Hmm, when you say `guaranteed to flush the DMA transaction', is that a PCI
> > requirement? If so, whether or not that DMA data is then visible to the CPU
> > is really specific to the host-controller implementation. It could easily be
> > buffered somewhere between the host controller and memory, for example.
>
> PCI has the producer/consumer ordering model as part of the
> driving concept in the spec. Basically it wants to see the ordering
> model preserved right to the driver code itself.
>
> Realistically, way back, archs that couldn't do the synchronous IO
> (like my old MIPS design) had to convert their drivers to MMIO and run
> that way. It never worked 100% properly, nor did it make sense to try
> and use an async outl, even though some systems provided it.

Thanks for the extra background information!

Arnd
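
For illustration only, the two ISR variants quoted above can be modeled in
plain user-space C. This is a minimal sketch, not kernel code: the
model_dev structure, the model_* accessors, the queue depth and the bit
values of INT_PENDING and ACK_INT are all assumptions made up for the
example. It assumes a bridge that buffers posted writes, a read that pushes
earlier posted writes ahead of it, and a device that clears its pending bit
when ACK_INT is written. Each ISR returns whether the level-triggered line
is still asserted on return, which is the condition that would cause a
spurious interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this model. */
#define INT_PENDING 0x1u
#define ACK_INT     0x1u
#define QUEUE_DEPTH 4

/* Hypothetical device sitting behind a bridge with a posted-write buffer. */
struct model_dev {
	uint32_t status;              /* device status register */
	uint32_t posted[QUEUE_DEPTH]; /* writes still buffered in the bridge */
	int nposted;
};

/* The device consumes a write: writing ACK_INT clears the pending bit. */
static void dev_apply(struct model_dev *d, uint32_t val)
{
	d->status &= ~val;
}

/* Deliver all buffered writes to the device, in order. */
static void bridge_flush(struct model_dev *d)
{
	for (int i = 0; i < d->nposted; i++)
		dev_apply(d, d->posted[i]);
	d->nposted = 0;
}

/* Posted (MMIO-style) write: returns before the device has seen it. */
static void model_writel(struct model_dev *d, uint32_t val)
{
	assert(d->nposted < QUEUE_DEPTH);
	d->posted[d->nposted++] = val;
}

/* Read: non-posted, so earlier posted writes reach the device first. */
static uint32_t model_readl(struct model_dev *d)
{
	bridge_flush(d);
	return d->status;
}

/* Port read: modeled the same way as the MMIO read. */
static uint32_t model_inl(struct model_dev *d)
{
	return model_readl(d);
}

/* Synchronous (I/O-style) write: returns only after the device applied it. */
static void model_outl(struct model_dev *d, uint32_t val)
{
	bridge_flush(d);
	dev_apply(d, val);
}

/* Level-triggered line as the interrupt controller sees it. */
static bool irq_line_asserted(const struct model_dev *d)
{
	return (d->status & INT_PENDING) != 0;
}

/* Port I/O ISR: the outl has taken effect by the time it returns. */
static bool isr_pio(struct model_dev *d)
{
	if (model_inl(d) & INT_PENDING)
		model_outl(d, ACK_INT);
	return irq_line_asserted(d);
}

/* MMIO ISR without read-back: the ack may still be in the bridge. */
static bool isr_mmio_no_readback(struct model_dev *d)
{
	if (model_readl(d) & INT_PENDING)
		model_writel(d, ACK_INT);
	return irq_line_asserted(d);
}

/* MMIO ISR with a synchronizing read that flushes the posted ack. */
static bool isr_mmio_readback(struct model_dev *d)
{
	if (model_readl(d) & INT_PENDING) {
		model_writel(d, ACK_INT);
		(void)model_readl(d);
	}
	return irq_line_asserted(d);
}
```

Starting each variant from a device with status set to INT_PENDING, only
the MMIO ISR without the read-back returns with the line still asserted in
this model.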