call trace on Khadas VIM Pro with AP6255 WiFi

Heiner Kallweit hkallweit1 at gmail.com
Wed Jun 7 22:43:04 PDT 2017


Am 08.06.2017 um 02:03 schrieb crow:
> Hi,
> 
> On Wed, Jun 7, 2017 at 11:04 PM, Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>> Am 07.06.2017 um 18:16 schrieb crow:
>>> Hi,
>>>
>>> On Mon, Jun 5, 2017 at 11:56 PM, Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>>>> Am 05.06.2017 um 20:18 schrieb Martin Blumenstingl:
>>>>> Hi Heiner,
>>>>>
>>>>> On Mon, Jun 5, 2017 at 6:12 PM, Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>>>>>> Am 05.06.2017 um 01:03 schrieb Martin Blumenstingl:
>>>>>>> Hi Heiner,
>>>>>>>
>>>>>>> On Sun, Jun 4, 2017 at 9:11 PM, Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>>>>>>>> Am 04.06.2017 um 17:59 schrieb crow:
>>>>>>>>> Hi,
>>>>>>>>> Posting to list as requested. I get call trace when using WiFi on Khadas VIM Pro, which comes with an AP6255. Firstly i wrote about this directly to Martin Blumenstingl, and he pointed me for this to post here.
>>>>>>>>> This looks like a bug in the meson-gx-mmc.c (SD/SDIO/eMMC) driver [1]. Martin last successful test was with kernel 2.11.
>>>>>>>>>
>>>>>>>>> How to reproduce this:
>>>>>>>>> mainline kernel 4.12.0-rc3 (using ArchLinuxARM arm8v generic rootfs)
>>>>>>>>> Khadas VIM Pro nvram file [2]
>>>>>>>> I have been in contact with others using AP6255-based SDIO WiFi with this driver. After the recent version of the driver
>>>>>>>> was released I didn't hear back from them so I assume WiFi is working fine for them.
>>>>>>> I've also seen the errors with older version of your patches where
>>>>>>> SDIO wifi would fail to initialize (or load the firmware).
>>>>>>> I just tried it myself on a Khadas VIM Pro (= the one with AP6255):
>>>>>>> - wifi firmware loads fine
>>>>>>> - as soon as I put traffic on the wireless interface (iperf3 -c <ip of
>>>>>>> server>) I hit the same WARN_ON as crow (it happens almost instantly
>>>>>>> after I start iperf3)
>>>>>>>
>>>>>>>> The firmware image you use seems to be for Android. Not sure which firmware the others are using.
>>>>>>>> This might be a potential reason for the issue.
>>>>>>>>
>>>>>>>>> Enable WiFI (connected to 5GHZ AP)
>>>>>>>>> just do an SSH to Khadas VIM Pro device and you will get these call trace [3] , eventually you get login prompt but it is very slow login.
>>>>>>>> The log shows that first HW reports a failed write transfer (status 0x0100) and then an unexpected interrupt occurs (host->cmd being NULL).
>>>>>>>> Apart from the firmware the failed transfer might also be caused by e.g. a too high frequency.
>>>>>>> # cat /sys/kernel/debug/mmc2/ios
>>>>>>> clock:          50000000 Hz
>>>>>>> actual clock:   50000000 Hz
>>>>>>> vdd:            21 (3.3 ~ 3.4 V)
>>>>>>> bus mode:       2 (push-pull)
>>>>>>> chip select:    0 (don't care)
>>>>>>> power mode:     2 (on)
>>>>>>> bus width:      2 (4 bits)
>>>>>>> timing spec:    2 (sd high-speed)
>>>>>>> signal voltage: 0 (3.30 V)
>>>>>>> driver type:    0 (driver type B)
>>>>>>>
>>>>>>> The .dts from Khadas (based on the Amlogic kernel) uses f_max 200000000
>>>>>>>
>>>>>>>> Would be interesting to hear what's the latest kernel version (or even better: a bisect) working for you.
>>>>>>> I don't really remember what the last actual working version was, but
>>>>>>> I can try to bisect it tomorrow
>>>>>>
>>>>>> This would be great.
>>>>> 79ed05e329c34c68ab2a87f729e666fcf208fdad is the first bad commit
>>>>> commit 79ed05e329c34c68ab2a87f729e666fcf208fdad
>>>>> Author: Heiner Kallweit <hkallweit1 at gmail.com>
>>>>> Date:   Fri Apr 7 21:22:44 2017 +0200
>>>>>
>>>>>     mmc: meson-gx: add support for descriptor chain mode
>>>>>
>>>>>     So far a bounce buffer is used to serialize the scatterlist(s).
>>>>>     This overhead can be avoided by switching to descriptor chain mode.
>>>>>     As result the performance is drastically improved. On a Odroid-C2 with
>>>>>     a 128 GB eMMC module raw reads reach 140 MB/s.
>>>>>
>>>>>     Prerequisite for descriptor chain mode is that all scatterlist buffers
>>>>>     are 8 byte aligned for 64-bit DMA. That's not always the case, at least
>>>>>     the brcmfmac SDIO WiFi driver is known to cause problems.
>>>>>
>>>>>     Therefore, for each request, check upfront whether all scatterlist
>>>>>     buffers are 8 byte aligned and fall back to bounce buffer mode if
>>>>>     that's not the case.
>>>>>
>>>>>     Signed-off-by: Heiner Kallweit <hkallweit1 at gmail.com>
>>>>>     Signed-off-by: Ulf Hansson <ulf.hansson at linaro.org>
>>>>>
>>>>> :040000 040000 c9771e14cfc70fa36ab67b6f4e246d714ff0e40d
>>>>> a5b816917ce3c7e0d7f2975ae116a94ef5b6db99 M      drivers
>>>>>
>>>>> $ git bisect log
>>>>> git bisect start
>>>>> # bad: [3c2993b8c6143d8a5793746a54eba8f86f95240f] Linux 4.12-rc4
>>>>> git bisect bad 3c2993b8c6143d8a5793746a54eba8f86f95240f
>>>>> # good: [94d765bbfb1910ef8db5699d19907a0756a99d03] mmc: meson-gx:
>>>>> simplify bounce buffer setting in meson_mmc_start_cmd
>>>>> git bisect good 94d765bbfb1910ef8db5699d19907a0756a99d03
>>>>> # bad: [16a12fa9aed176444fc795b09e796be41902bb08] Merge branch
>>>>> 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
>>>>> git bisect bad 16a12fa9aed176444fc795b09e796be41902bb08
>>>>> # good: [8d65b08debc7e62b2c6032d7fe7389d895b92cbc] Merge
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>>>> git bisect good 8d65b08debc7e62b2c6032d7fe7389d895b92cbc
>>>>> # good: [8d65b08debc7e62b2c6032d7fe7389d895b92cbc] Merge
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>>>> git bisect good 8d65b08debc7e62b2c6032d7fe7389d895b92cbc
>>>>> # good: [b769fefb68cd70385d68220ae341e5a10723fbc0] Backmerge tag
>>>>> 'v4.11-rc6' into drm-next
>>>>> git bisect good b769fefb68cd70385d68220ae341e5a10723fbc0
>>>>> # bad: [2f34c1231bfc9f2550f934acb268ac7315fb3837] Merge tag
>>>>> 'drm-for-v4.12' of git://people.freedesktop.org/~airlied/linux
>>>>> git bisect bad 2f34c1231bfc9f2550f934acb268ac7315fb3837
>>>>> # bad: [0302e28dee643932ee7b3c112ebccdbb9f8ec32c] Merge branch 'next'
>>>>> of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
>>>>> git bisect bad 0302e28dee643932ee7b3c112ebccdbb9f8ec32c
>>>>> # bad: [68fed41e0ff6c0332520a0d70ac05be2a7d9130e] Merge tag
>>>>> 'pinctrl-v4.12-1' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
>>>>> git bisect bad 68fed41e0ff6c0332520a0d70ac05be2a7d9130e
>>>>> # good: [ec5ab8933772c87f24ad62a4a602fe8949f423c2] mmc: host:
>>>>> omap_hsmmc: checking for NULL instead of IS_ERR()
>>>>> git bisect good ec5ab8933772c87f24ad62a4a602fe8949f423c2
>>>>> # skip: [8d4264d86c1a31d7e0efd3e4c3e08bc91dbdb48c] Merge tag
>>>>> 'samsung-pinctrl-4.12' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/samsung into
>>>>> devel
>>>>> git bisect skip 8d4264d86c1a31d7e0efd3e4c3e08bc91dbdb48c
>>>>> # bad: [e145ac451eb68b51e0ede4c131bd5a539fb675b6] mmc: sdhci-of-esdhc:
>>>>> poll ESDHC_CLOCK_STABLE bit with udelay
>>>>> git bisect bad e145ac451eb68b51e0ede4c131bd5a539fb675b6
>>>>> # bad: [83b600b8de0e83661a8f40091a7966fac052a21b] mmc: sdhci: Control
>>>>> the delay between tuning commands
>>>>> git bisect bad 83b600b8de0e83661a8f40091a7966fac052a21b
>>>>> # bad: [c34d1579de4665d9613691bc37aebb7ca86d535d] mmc: cavium: Remove
>>>>> redundant pointer check
>>>>> git bisect bad c34d1579de4665d9613691bc37aebb7ca86d535d
>>>>> # good: [c08bcb6c90068d8eacffdb8b5f77916de5481a72] mmc: meson-gx:
>>>>> introduce struct meson_tuning_params
>>>>> git bisect good c08bcb6c90068d8eacffdb8b5f77916de5481a72
>>>>> # good: [0b6ed71c3e95abf6cd1dbfc53919defdce3449f5] mmc: meson-gx: add
>>>>> basic tuning for rx clock phase
>>>>> git bisect good 0b6ed71c3e95abf6cd1dbfc53919defdce3449f5
>>>>> # bad: [79ed05e329c34c68ab2a87f729e666fcf208fdad] mmc: meson-gx: add
>>>>> support for descriptor chain mode
>>>>> git bisect bad 79ed05e329c34c68ab2a87f729e666fcf208fdad
>>>>> # first bad commit: [79ed05e329c34c68ab2a87f729e666fcf208fdad] mmc:
>>>>> meson-gx: add support for descriptor chain mode
>>>>>
>>>>> I manually applied [0] in all steps, otherwise there was an infinite
>>>>> stream of "brcmfmac: brcmf_sdio_rxglom: sublen 84 not multiple of 8"
>>>>> messages
>>>>>
>>>>> testing steps:
>>>>> - ifconfig eth0 down
>>>>> - <start wifi>
>>>>> - iperf3 -c <server ip>
>>>>> - iperf3 -c <server ip> -R
>>>>>
>>>>>>>
>>>>>>>> I'm using a Odroid-C2 where SDIO isn't available, so I'm not able to test.
>>>>>>> are you from Germany? I can give you a Khadas VIM so you can test this
>>>>>>> yourself for a few weeks if you want
>>>>>>>
>>>>>>> what just caught my eye though is the max_req_size calculation: [0]
>>>>>>> Amlogic's kernel uses max_req_size = <0x20000>; /**128KB*/ for the SD
>>>>>>> and SDIO controllers, but max_req_size = <0x20000>; /**256KB*/ for the
>>>>>>> eMMC controller
>>>>>>> it caught my eye because (some time ago) I found out that there are
>>>>>>> two different hardware buffer sizes for the UART FIFO buffer (see page
>>>>>>> 309 of the public S905 datasheet, UART0 has a 128 byte buffer, while
>>>>>>> all other UARTs only have a 64 byte buffer)
>>>>>>>
>>>>>> In the logs provided by crow the issue happened with small writes
>>>>>> (just one or two blocks), so most likely the problem here has a
>>>>>> different nature.
>>>>>>
>>>>>> In general I'm wondering whether the mainline Meson GX mmc driver ever
>>>>>> fully supported SDIO. For example SDIO interrupts aren't supported.
>>>>> actually I think SDIO interrupts are the only "SDIO specific" feature
>>>>> that is missing
>>>>> but it's not as important as one thinks, since we could even simulate
>>>>> this with yours (and Jerome's and Carlo's) GPIO interrupts instead
>>>>>
>>>>> I don't see anything wrong with SDIO wifi support "before" the broken commit:
>>>>> Accepted connection from <Khadas VIM IP>, port 34078
>>>>> [  5] local <server IP> port 5201 connected to <Khadas VIM IP> port 34080
>>>>> [ ID] Interval           Transfer     Bandwidth
>>>>> [  5]   0.00-1.00   sec  3.34 MBytes  28.0 Mbits/sec
>>>>> [  5]   1.00-2.00   sec  5.20 MBytes  43.6 Mbits/sec
>>>>> [  5]   2.00-3.00   sec  5.13 MBytes  43.1 Mbits/sec
>>>>> [  5]   3.00-4.00   sec  7.32 MBytes  61.4 Mbits/sec
>>>>> [  5]   4.00-5.00   sec  3.99 MBytes  33.5 Mbits/sec
>>>>> [  5]   5.00-6.00   sec  7.27 MBytes  60.9 Mbits/sec
>>>>> [  5]   6.00-7.00   sec  3.62 MBytes  30.3 Mbits/sec
>>>>> [  5]   7.00-8.00   sec  6.00 MBytes  50.3 Mbits/sec
>>>>> [  5]   8.00-9.00   sec  4.48 MBytes  37.6 Mbits/sec
>>>>> [  5]   9.00-10.00  sec  4.41 MBytes  37.0 Mbits/sec
>>>>> [  5]  10.00-10.05  sec   390 KBytes  69.8 Mbits/sec
>>>>> - - - - - - - - - - - - - - - - - - - - - - - - -
>>>>> [ ID] Interval           Transfer     Bandwidth
>>>>> [  5]   0.00-10.05  sec  0.00 Bytes  0.00 bits/sec                  sender
>>>>> [  5]   0.00-10.05  sec  51.1 MBytes  42.7 Mbits/sec                  receiver
>>>>> -----------------------------------------------------------
>>>>> Server listening on 5201
>>>>> -----------------------------------------------------------
>>>>> Accepted connection from <Khadas VIM IP>, port 34082
>>>>> [  5] local <server IP> port 5201 connected to <Khadas VIM IP> port 34084
>>>>> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
>>>>> [  5]   0.00-1.00   sec  6.03 MBytes  50.5 Mbits/sec   22    180 KBytes
>>>>> [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   47    189 KBytes
>>>>> [  5]   2.00-3.00   sec  5.59 MBytes  46.9 Mbits/sec   26    204 KBytes
>>>>> [  5]   3.00-4.00   sec  4.54 MBytes  38.1 Mbits/sec    0    219 KBytes
>>>>> [  5]   4.00-5.00   sec  5.84 MBytes  49.0 Mbits/sec   22    229 KBytes
>>>>> [  5]   5.00-6.00   sec  5.03 MBytes  42.2 Mbits/sec   18    239 KBytes
>>>>> [  5]   6.00-7.00   sec  5.97 MBytes  50.0 Mbits/sec    1    256 KBytes
>>>>> [  5]   7.00-8.00   sec  6.15 MBytes  51.6 Mbits/sec    0    274 KBytes
>>>>> [  5]   8.00-9.00   sec  4.54 MBytes  38.1 Mbits/sec   12    269 KBytes
>>>>> [  5]   9.00-10.00  sec  6.77 MBytes  56.8 Mbits/sec    4    296 KBytes
>>>>> [  5]  10.00-10.07  sec   636 KBytes  74.2 Mbits/sec    0    297 KBytes
>>>>> - - - - - - - - - - - - - - - - - - - - - - - - -
>>>>> [ ID] Interval           Transfer     Bandwidth       Retr
>>>>> [  5]   0.00-10.07  sec  54.4 MBytes  45.3 Mbits/sec  152             sender
>>>>> [  5]   0.00-10.07  sec  0.00 Bytes  0.00 bits/sec                  receiver
>>>>>
>>>> Thanks a lot for the analysis!
>>>>
>>>> Somewhat strange that descriptor chain mode works fine with SD/eMMC but causes
>>>> issues with SDIO. Sounds like the controller might have a problem.
>>>> The vendor SDIO driver includes at least one quirk:
>>>> https://github.com/hardkernel/linux/blob/odroidc2-3.14.y/drivers/amlogic/mmc/aml_sdio.c#L157
>>>> But this doesn't seem to be the problem here.
>>>>
>>>> The Meson GX mmc driver allows already to fall back to bounce buffer mode,
>>>> so far this is used in case of a unaligned sg dma buffer. As a possible
>>>> workaround we could try the same fallback if command opcode is SD_IO_RW_EXTENDED.
>>>> (as crow's log shows issues with this comamnd only)
>>>>
>>>> Could you please check whether the following fixes SDIO?
>>>>
>>>> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
>>>> index 1842ed34..c51c37ea 100644
>>>> --- a/drivers/mmc/host/meson-gx-mmc.c
>>>> +++ b/drivers/mmc/host/meson-gx-mmc.c
>>>> @@ -210,6 +210,9 @@ static void meson_mmc_get_transfer_mode(struct mmc_host *mmc,
>>>>         int i;
>>>>         bool use_desc_chain_mode = true;
>>>>
>>>> +       if (mrq->cmd->opcode == SD_IO_RW_EXTENDED)
>>>> +               return;
>>>> +
>>>>         for_each_sg(data->sg, sg, data->sg_len, i)
>>>>                 /* check for 8 byte alignment */
>>>>                 if (sg->offset & 7) {
>>>>
>>>> Rgds, Heiner
>>>>
>>>
>>> I can confirm that, it does work with fix your provided. There is no
>>> Kernel panic anymore here, and I was able to SSH to device, but there
>>> is high lag there. Probably due the "sublen 1532 not multiple of 8"
>>> which Martin already mentioned, so I am compiling Kernel with that
>>> patch to check if that will fix the lag.
>>>
>> Thanks a lot for the report. Great that it fixed SDIO, although I don't
>> really have a good explanation why descriptor chain mode is failing
>> with SDIO but not with SD/eMMC.
>>
>> Warning "sublen xxx not multiple of 8" is harmless and doesn't cause
>> any re-sending of packets or similar.
>> However printing the warning with high frequency mighte cause a slowdown.
>> See also this log line: net_ratelimit: 2037 callbacks suppressed
>>
>> Curious to hear whether the patch removing this error message improves
>> WiFi responsiveness.
> 
> Input lag is still there. Nothing is written anymore to dmesg from
> "sublen xxx not multiple of 8" or from Kernel panic. Just typing "cd"
> will give you input lag when using SSH over WiFI. Doing same over
> Ethernet is working just fine without any input lag. Also note that i
> am using ArchLinuxArm on micro SD card and not eMMC storage. Maybe
> this have something with that what Martin wrote about "max_req_size
> calculation" but all that is out of my knowledge.
> 
> I testet iperf3 like Martin did and here are results (AP/router <> Khadas VIM)
> -----------------------------------------------------------
> Accepted connection from <Khadas VIM IP>, port 58774
> [  5] local <server IP> port 5201 connected to <Khadas VIM IP> port 58776
> [ ID] Interval           Transfer     Bandwidth
> [  5]   0.00-1.00   sec  4.96 MBytes  41.6 Mbits/sec
> [  5]   1.00-2.00   sec  5.20 MBytes  43.6 Mbits/sec
> [  5]   2.00-3.00   sec  5.12 MBytes  43.0 Mbits/sec
> [  5]   3.00-4.00   sec  5.35 MBytes  44.8 Mbits/sec
> [  5]   4.00-5.00   sec  5.29 MBytes  44.4 Mbits/sec
> [  5]   5.00-6.00   sec  5.12 MBytes  43.0 Mbits/sec
> [  5]   6.00-7.00   sec  5.19 MBytes  43.5 Mbits/sec
> [  5]   7.00-8.00   sec  5.10 MBytes  42.8 Mbits/sec
> [  5]   8.00-9.00   sec  5.22 MBytes  43.8 Mbits/sec
> [  5]   9.00-10.00  sec  5.10 MBytes  42.8 Mbits/sec
> [  5]  10.00-10.02  sec  97.6 KBytes  43.8 Mbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  5]   0.00-10.02  sec  51.9 MBytes  43.5 Mbits/sec    0             sender
> [  5]   0.00-10.02  sec  51.7 MBytes  43.3 Mbits/sec                  receiver
> -----------------------------------------------------------
> Server listening on 5201
> -----------------------------------------------------------
> Accepted connection from <Khadas VIM IP>, port 58778
> [  5] local <server IP> port 5201 connected to <Khadas VIM IP> port 58780
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  5]   0.00-1.00   sec  5.87 MBytes  49.2 Mbits/sec    0    417 KBytes
> [  5]   1.00-2.00   sec  5.83 MBytes  48.9 Mbits/sec    0    417 KBytes
> [  5]   2.00-3.00   sec  5.81 MBytes  48.8 Mbits/sec    0    417 KBytes
> [  5]   3.00-4.00   sec  5.88 MBytes  49.4 Mbits/sec    0    417 KBytes
> [  5]   4.00-5.00   sec  5.82 MBytes  48.8 Mbits/sec    0    417 KBytes
> [  5]   5.00-6.00   sec  5.89 MBytes  49.4 Mbits/sec    0    417 KBytes
> [  5]   6.00-7.00   sec  5.79 MBytes  48.6 Mbits/sec    0    417 KBytes
> [  5]   7.00-8.00   sec  5.96 MBytes  50.0 Mbits/sec    0    417 KBytes
> [  5]   8.00-9.00   sec  5.77 MBytes  48.4 Mbits/sec    0    417 KBytes
> [  5]   9.00-10.00  sec  5.74 MBytes  48.2 Mbits/sec    0    417 KBytes
> [  5]  10.00-10.03  sec   110 KBytes  31.6 Mbits/sec    0    417 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  5]   0.00-10.03  sec  58.5 MBytes  48.9 Mbits/sec    0             sender
> [  5]   0.00-10.03  sec  58.5 MBytes  48.9 Mbits/sec                  receiver
> -----------------------------------------------------------
> 
Thanks for re-testing. Then I will submit the workaround fixing SDIO.

Lagging is a quite frequent issue with WiFi and can have very different reasons.
I'm not sure we can blame the SDIO host driver for this in the case here.
E.g. RPi3 uses the same brcmfmac driver with other SDIO host driver and there
are also lot of complaints about poor Wifi performance.

> Regards,
> 
>> Rgds, Heiner
>>
>>> In the link [0] you can find whole dmesg, at the end there is info
>>> that Ethernet cable was unplugged and then WiFi message about that
>>> sublen. No information that WiFi was enabled/start but for sure it was
>>> working (saw info about dhcpcd on wlan0 in journalctl ).
>>>
>>> Regards,
>>>
>>>>>
>>>>>> Kevin:
>>>>>> Before submitting the initial version of the driver, did you test also
>>>>>> SDIO or just SD/eMMC?
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Martin
>>>>>>>
>>>>>>> [0] https://github.com/torvalds/linux/blob/master/drivers/mmc/host/meson-gx-mmc.c#L971
>>>>>>>
>>>>>>
>>>>>
>>>>> [0] https://patchwork.kernel.org/patch/9725563/
>>>>>
>>>>
>>>
>>> [0] https://defuse.ca/b/0VnfDe3S
>>>
>>
> 




More information about the linux-amlogic mailing list