Low network throughput on i.MX28

Jörg Krause joerg.krause at embedded.rocks
Sun Nov 20 01:14:35 PST 2016


Hi Stefan,

On Sat, 2016-11-19 at 12:36 +0100, Stefan Wahren wrote:
> Hi Jörg,
> 
> > Jörg Krause <joerg.krause at embedded.rocks> wrote on 19 November 2016
> > at 00:49:
> > 
> > 
> > Hi all,
> > 
> > [snip]
> > 
> > I did some time measurements on the wifi, mmc and dma drivers to
> > compare the performance of the vendor and the mainline kernel. For
> > this I toggled some GPIOs and measured the time difference with an
> > oscilloscope. I started measuring before calling sdio_readsb() in the
> > wifi driver [1] and stopped when the call returned. Note that the
> > time was only measured for a packet length of 1536 bytes.
> > 
> > The vendor kernel took about 250 us to return whereas the mainline
> > kernel took about 325 us. To investigate where this additional time
> > comes from I divided the whole procedure into separate parts and
> > compared the time each part consumed.
> > 
> > I noticed that the mainline kernel takes much longer to return
> > after the DMA request is done, signalled in this case by calling
> > mxs_mmc_dma_irq_callback() [2] in the mxs-mmc driver. From here it
> > takes about 150 us to get back to sdio_readsb().
> > 
> > One example of where the extra time goes is the mainline mmc driver,
> > which spends about 50 us in mmc_wait_done() [3] just calling
> > complete(), whereas the vendor mmc driver returns almost immediately
> > here.
> > 
> > I wonder why this call to complete() consumes so much time. Any
> > ideas?
> 
> I don't know why, but how about putting the SDIO clk signal in parallel
> with the GPIOs on your oscilloscope? That way you could get a better
> view of the runtime behavior.

Unfortunately, the board layout does not allow me to access the SDIO
pins.

The main question for me is why the mmc core driver needs around 120
us from calling complete() in mmc_wait_done() [1] until receiving the
completion signal in mmc_wait_for_req_done() [2]. Why does signaling
the completion consume so much time?

For comparison, the mmc request itself (preparing the request,
preparing DMA, doing DMA, waiting, reading the response, starting to
signal completion) takes about 215 us, whereas just signaling that the
request is complete takes 120 us. To me, this is the bottleneck.

Does anyone have an idea why signaling the completion is so slow?

[1] http://lxr.free-electrons.com/source/drivers/mmc/core/core.c#L386
[2] http://lxr.free-electrons.com/source/drivers/mmc/core/core.c#L492

> Btw you should also verify the necessary time between 2 packets.
> 
> Stefan
> 
> > 
> > [1] http://lxr.free-electrons.com/source/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c#L488
> > 
> > [2] http://lxr.free-electrons.com/source/drivers/mmc/host/mxs-mmc.c#L179
> > 
> > [3] http://lxr.free-electrons.com/source/drivers/mmc/core/core.c#L386
> > 
> > Best regards,
> > Jörg Krause


