Intel I350 mini-PCIe card (igb) on Mirabox (mvebu / Armada 370)

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Tue Apr 8 11:08:28 PDT 2014


On Tue, Apr 08, 2014 at 07:53:24PM +0200, Willy Tarreau wrote:
> Hi Jason,
> 
> On Tue, Apr 08, 2014 at 11:14:17AM -0600, Jason Gunthorpe wrote:
> > The mbus hardware requires a power of two size, if a non-power
> > of two is passed in to the low level routines they configure the
> > register in a way that results in undefined behaviour.
> > 
> > - WARN_ON to make this invalid usage really obvious
> > - Round down to the nearest power of two so we only stuff a valid
> >   size into the HW
> 
> So if I understand it right, for example when my first NIC registers
> e0000000-e02fffff, then we'll truncate it down to e0000000-e01fffff,
> won't that cause any issue, for example if the NIC needs to access
> data beyond that limit ?

The mbus windows are assigned to the bridge, not to the NIC bars:

00:0a.0 PCI bridge [0604]: Marvell Technology Group Ltd. Device [11ab:7846] (rev 02) (prog-if 00 [Normal decode])
        Memory behind bridge: e0000000-e17fffff

So the above is not OK, 17fffff is not a power of two. The patch
causes the mbus to WARN_ON and then round down so that the hardware is
at least in a defined configuration.

> BTW I can try your patch with the myricom NIC which started to work
> when rounding up, I'll quickly see if that fixes the issue as well,
> but I'm now a little bit confused.

This patch won't fix anything, it just fixes the mbus driver to always
program the hardware with valid values, even if the upper layer
requests something invalid. Basically, we spent a few weeks tracking
this behavior down, the patch would ensure that time doesn't get spent
again.

The goal now is to avoid the WARN_ON in the patch from firing.

For a proper fix, something like this to create multiple aligned
windows is a simple option (untested, need the inverse on the
de-register, loop should probably live in mbus code):

diff --git a/drivers/pci/host/pci-mvebu.c b/drivers/pci/host/pci-mvebu.c
index 789cdb2..7312c82 100644
--- a/drivers/pci/host/pci-mvebu.c
+++ b/drivers/pci/host/pci-mvebu.c
@@ -382,6 +382,9 @@ static void mvebu_pcie_handle_iobase_change(struct mvebu_pcie_port *port)
 
 static void mvebu_pcie_handle_membase_change(struct mvebu_pcie_port *port)
 {
+       phys_addr_t base;
+       unsigned int mapped_size;
+
        /* Are the new membase/memlimit values invalid? */
        if (port->bridge.memlimit < port->bridge.membase ||
            !(port->bridge.command & PCI_COMMAND_MEMORY)) {
@@ -408,8 +411,16 @@ static void mvebu_pcie_handle_membase_change(struct mvebu_pcie_port *port)
                (((port->bridge.memlimit & 0xFFF0) << 16) | 0xFFFFF) -
                port->memwin_base;
 
-       mvebu_mbus_add_window_by_id(port->mem_target, port->mem_attr,
-                                   port->memwin_base, port->memwin_size);
+       base = port->memwin_base;
+       mapped_size = 0;
+       while (mapped_size < port->memwin_size) {
+               unsigned int size =
+                   rounddown_pow_of_two(port->memwin_size - mapped_size);
+               mvebu_mbus_add_window_by_id(port->mem_target, port->mem_attr,
+                                           base, size);
+               base += size;
+               mapped_size += size;
+       }
 }
 

Jason



More information about the linux-arm-kernel mailing list