Issue found in Armada 370: "No buffer space available" error during continuous ping

Willy Tarreau w at 1wt.eu
Sun Jul 20 22:44:05 PDT 2014


Hi Maggie,

On Sun, Jul 20, 2014 at 07:45:13PM -0700, Maggie Mae Roxas wrote:
> Hi Willy,
> Good day.
> 
> BTW, here are some answers to your questions.
> 
> > In fact I don't know if you're running your own board or a "standard"
> > one (a mirabox or any NAS board).
> We are using a "customized" one, not a "standard" one.
> We based the design on Armada 370 RD Evaluation Board, but we used
> Marvell 88E1512 as Ethernet PHY and Marvell 88F6707 as processor
> instead of the ones in the Armada 370 RD (I think it uses Marvell
> 88E1310 as Ethernet PHY and Marvell 88F6W11 as processor).

OK. For information, the mirabox uses a 88F6707 and a 88E1510 for
the phy. So it's very close to what you have.

> > Maggie, do you know if it is possible that for any reason your board
> > would not deliver an IRQ on Tx completion? That could explain things.
> > You can easily test reverting commit 4f3a4f701b just in case.
> > If that's the case, then the next step will be to figure out how it is
> > possible that IRQs are disabled!
> After reverting 4f3a4f701b, as I reported, issue does not happen anymore.

Since you said that you both applied cd71e2 and reverted 4f3a4f, could
you please confirm that applying cd71e2 alone was not enough? I find
this really strange: since you use the same CPU as the mirabox, I see
no reason why the IRQ wouldn't work, and since your phy is slightly
different from ours, the first patch, which changes the RGMII
configuration (cd71e2), would be a more likely candidate.

> Please let me know how to "figure out how it is possible that IRQs are
> disabled".

Checking /proc/interrupts while you're sending some traffic should show
that the IRQ count is increasing from time to time.
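
If you prefer to automate that check, here is a minimal sketch in
Python, not a definitive tool. It assumes the usual /proc/interrupts
layout (a header line naming the CPUs, then one "label: counts
description" line per IRQ) and that the Ethernet IRQ line contains
"mvneta"; that pattern is an assumption, so adjust it to whatever name
your kernel shows. The function names are hypothetical helpers.

```python
import re
import time


def parse_interrupts(text):
    """Parse /proc/interrupts text into {label: (total_count, description)}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header line: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        fields = rest.split()
        counts = []
        # Per-CPU counters come first; stop at the first non-numeric field.
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        result[label.strip()] = (sum(counts), description)
    return result


def irq_deltas(before, after, pattern):
    """Return {label: count increase} for IRQs whose description matches."""
    rx = re.compile(pattern)
    deltas = {}
    for label, (count, description) in after.items():
        if label in before and rx.search(description):
            deltas[label] = count - before[label][0]
    return deltas


def watch(pattern="mvneta", interval=5.0, path="/proc/interrupts"):
    """Sample the file twice, `interval` seconds apart, and print the deltas."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read())
    for label, delta in irq_deltas(before, after, pattern).items():
        print("IRQ %s: +%d in %.1fs" % (label, delta, interval))
```

Call watch() while the ping or some other Tx traffic is running. A
delta that stays at 0 for the Ethernet IRQ would suggest that the
interrupts are not being delivered.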

> Also, what is the impact if I use this combination?

First, you're not using a mainline kernel, which means that you'll
always be bothered with carrying these changes yourself. Second,
removing support for the Tx IRQ means that your Tx traffic can become
very slow (typically 134 Mbps instead of 987 Mbps for unidirectional
traffic), which can be a problem if your board is used as a router,
for example. If you're building a NAS, the impact will be smaller.
Third, considering that other boards work without these changes,
there might be an issue on your board. Detecting it early would allow
you to fix it for all future batches and apply these patches only to
the very first ones.

Regards,
Willy




More information about the linux-arm-kernel mailing list