Issue found in Armada 370: "No buffer space available" error during continuous ping

Maggie Mae Roxas maggie.mae.roxas at gmail.com
Thu Jul 24 00:24:27 PDT 2014


Hi Willy,
Good day.

> I have no idea, I remember that this code is deeply buried into the
> original neta code. There was also a large code for the network
> classifier and something like buffer management in the original
> Marvell's driver if my memory serves me correctly, I have no idea
> if these ones set up anything special.
I also have no idea on this.

> Just the original ones. I have a mirabox with its original boot loader :
> U-Boot 2009.08 (Sep 16 2012 - 22:50:06)Marvell version: 1.1.2 NQ
This, I think, is a 2012_Q4 (QA) release if it's based on 2009.08.
We use a 2014_T1 (QA) release based on 2011.12.
# So we're using different boot loaders.

> I think we could try to dump all of our respective mvneta registers and
> compare them, though I have very little time for this today. And if it
> comes from extra SoC functions like buffer management or network classifier,
> I have no idea how they work nor what to dump :-/
Yes, I think the difference in boot loaders, especially in the mvneta
register values, is what matters for our difference in behavior.

And indeed, comparing all of the mvneta register dumps would take a lot
of our time, and we would need to forward those values directly to a
Marvell contact for a more effective evaluation, since this is more of
a hardware matter. :-/
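
If we do get to comparing dumps, the comparison step could be scripted
along these lines. This is only a minimal sketch: the dump format (one
hex "offset: value" pair per line) and the function names parse_dump and
diff_dumps are assumptions for illustration, not the output of any
particular tool, and the offsets in the usage example are arbitrary.

```python
import re

# Assumed (hypothetical) dump format: one hex "offset value" pair per line,
# e.g. "0x2400: 0x00000001". Adjust the pattern to the real dumping tool.
LINE_RE = re.compile(
    r"^\s*(?:0x)?([0-9a-fA-F]+)\s*[:=\s]\s*(?:0x)?([0-9a-fA-F]+)\s*$"
)


def parse_dump(text):
    """Return {offset: value} for every line that looks like a register pair."""
    regs = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            regs[int(match.group(1), 16)] = int(match.group(2), 16)
    return regs


def diff_dumps(ours, theirs):
    """List (offset, our_value, their_value) where the two dumps disagree.

    A value of None means the offset is missing from that dump.
    """
    differences = []
    for offset in sorted(set(ours) | set(theirs)):
        a, b = ours.get(offset), theirs.get(offset)
        if a != b:
            differences.append((offset, a, b))
    return differences


if __name__ == "__main__":
    ours = parse_dump("0x2400: 0x00000001\n0x2404: 0x000000ff\n")
    theirs = parse_dump("0x2400: 0x00000001\n0x2404: 0x0000000f\n")
    for offset, a, b in diff_dumps(ours, theirs):
        print(f"0x{offset:04x}: ours={a:#010x} theirs={b:#010x}")
```

Only the differing offsets would then need to go to the Marvell contact,
instead of the full dumps.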

In any case, since performance and Tx interrupts are both still OK for
us, I think that's enough.
We'll just inform you if we notice any irregularities.
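
To keep an eye on this, the /proc/interrupts check quoted below (reading
mvneta's counter before and after an iperf run) could be automated
roughly as follows. This is a sketch under assumptions: the function name
irq_counts is made up for illustration, and the sample text in the usage
example only imitates the /proc/interrupts layout (a CPU header line,
then "IRQ: per-CPU counts, controller, device name"); the exact columns
vary between kernels.

```python
def irq_counts(text, name):
    """Sum the per-CPU counters of every interrupt line naming `name`.

    `text` is the content of /proc/interrupts. Returns {irq_label: total}.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip lines without an IRQ label or without `name` after the counters.
        if not sep or name not in fields[ncpu:]:
            continue
        totals[label.strip()] = sum(int(f) for f in fields[:ncpu])
    return totals


if __name__ == "__main__":
    with open("/proc/interrupts") as handle:
        print(irq_counts(handle.read(), "mvneta"))
```

Running it once before and once after the iperf test and subtracting the
totals gives the number of interrupts raised during the test.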

Thank you very much for discussing this issue with us! :-)

Until our future discussions,
Maggie Roxas

On Tue, Jul 22, 2014 at 11:16 PM, Willy Tarreau <w at 1wt.eu> wrote:
> Hi Maggie,
>
> On Tue, Jul 22, 2014 at 07:24:35PM -0700, Maggie Mae Roxas wrote:
>> Hi Willy,
>> Good day.
>>
>> > OK so clearly the issue must be found.
>>
>> Actually we have 2 products using Armada 370.
>> One has only 1 ethernet port, so it is expected to act as Client only.
>> The other one has 2 ethernet ports, so it's more router-like.
>>
>> For the product with one port, we have checked the combination patch
>> and it seems like Tx IRQ is increasing so it's OK. We checked this via
>> /proc/interrupts and mvneta's value there changed from 500000+ to
>> around 900000+ after we performed a 10-iteration iperf to the server.
>> The throughput is also OK, we're getting around 850 Mbit/s when we use a
>> 1 Gbit/s connection, which is roughly the same as what we were
>> getting when we were still using 3.10.x (even 3.2.x).
>
> OK.
>
>> As for the other product with two ports, we do expect that we might be
>> encountering the slow performance you mentioned.
>> But we are not focusing on this project yet so once it's active again,
>> I'll let you know.
>>
>> > Just thinking about something, do you have a custom boot loader ?
>> > It would be possible that in our case, the Tx IRQ works only because some
>> > obscure or undocumented bits are set by the boot loader and that in your
>> > case it's not pre-initialized.
>>
>> We are indeed using a "custom" boot loader.
>> We are using Marvell u-boot 2014_T1.1 (latest QA release, I think).
>> We applied some patches to memory (since we have 1Gb DDR), some bits
>> and pieces for the interfaces we're going to support and not to
>> support, and of course our own environment variables.
>> As for the DDR memory/register patches, they came directly from our
>> Marvell contact.
>>
>> But with what I mentioned above, I think our Tx interrupt is working...?
>
> Yes, seems so.
>
>> BTW, for both products we've designed from Armada 370 RD, we didn't
>> use a switch. So we removed all switch-related code in the boot
>> loader.
>> I'm not sure if not having a switch affects the behavior?
>
> I have no idea, I remember that this code is deeply buried into the
> original neta code. There was also a large code for the network
> classifier and something like buffer management in the original
> Marvell's driver if my memory serves me correctly, I have no idea
> if these ones set up anything special.
>
>> How about you? May I know what boot loader you are using?
>
> Just the original ones. I have a mirabox with its original boot loader :
>
>     U-Boot 2009.08 (Sep 16 2012 - 22:50:06)Marvell version: 1.1.2 NQ
>     U-Boot Addressing:
>            Code:            00600000:006AFFF0
>            BSS:             006F8E40
>            Stack:           0x5fff70
>            PageTable:       0x8e0000
>            Heap address:    0x900000:0xe00000
>     Board: DB-88F6710-BP
>     SoC:   MV6710 A1
>     CPU:   Marvell PJ4B v7 UP (Rev 1) LE
>            CPU @ 1200Mhz, L2 @ 600Mhz
>            DDR @ 600Mhz, TClock @ 200Mhz
>            DDR 16Bit Width, FastPath Memory Access
>     PEX 0: Detected No Link.
>     PEX 1: Root Complex Interface, Detected Link X1
>     DRAM:   1 GB
>            CS 0: base 0x00000000 size 512 MB
>            CS 1: base 0x20000000 size 512 MB
>            Addresses 14M - 0M are saved for the U-Boot usage.
>     NAND:  1024 MiB
>     Bad block table found at page 262016, version 0x01
>     Bad block table found at page 261888, version 0x01
>     FPU not initialized
>     USB 0: Host Mode
>     USB 1: Host Mode
>     Modules/Interfaces Detected:
>            RGMII0 Phy
>            RGMII1 Phy
>            PEX0 (Lane 0)
>            PEX1 (Lane 1)
>     phy16= 72
>     phy16= 72
>     MMC:   MRVL_MMC: 0
>     Net:   egiga0 [PRIME], egiga1
>     Hit any key to stop autoboot:  0
>
>> > LTS would probably even interest your customer as it's an LTS version.
>> > In this case, always pick the most recent one (3.14.12 today). You may
>> > even be interested in 3.15.6 which contains another phy fix supposed to
>> > fix cd71e2, but if you're saying that it doesn't change anything for you
>> > I guess it will have no effect (might be worth testing for the purpose of
>> > helping troubleshooting though).
>>
>> Thank you for this advice, we'll take note of this.
>> We plan to stick to using LTS from now on, as much as possible.
>>
>> > OK. I still have a hard time imagining how hardware itself could prevent
>> > an IRQ from being delivered from a NIC which is located inside the SoC,
>> > but there must be an explanation somewhere :-/
>> I also would like to know how. :-/
>> But maybe it's our difference in boot loader as you speculated.
>
> I think we could try to dump all of our respective mvneta registers and
> compare them, though I have very little time for this today. And if it
> comes from extra SoC functions like buffer management or network classifier,
> I have no idea how they work nor what to dump :-/
>
> Regards,
> Willy
>


