call trace on Khadas VIM Pro with AP6255 WiFi

Heiner Kallweit hkallweit1 at gmail.com
Wed Jun 7 14:04:24 PDT 2017


Am 07.06.2017 um 18:16 schrieb crow:
> Hi,
> 
> On Mon, Jun 5, 2017 at 11:56 PM, Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>> Am 05.06.2017 um 20:18 schrieb Martin Blumenstingl:
>>> Hi Heiner,
>>>
>>> On Mon, Jun 5, 2017 at 6:12 PM, Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>>>> Am 05.06.2017 um 01:03 schrieb Martin Blumenstingl:
>>>>> Hi Heiner,
>>>>>
>>>>> On Sun, Jun 4, 2017 at 9:11 PM, Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>>>>>> Am 04.06.2017 um 17:59 schrieb crow:
>>>>>>> Hi,
>>>>>>> Posting to list as requested. I get call trace when using WiFi on Khadas VIM Pro, which comes with an AP6255. Firstly i wrote about this directly to Martin Blumenstingl, and he pointed me for this to post here.
>>>>>>> This looks like a bug in the meson-gx-mmc.c (SD/SDIO/eMMC) driver [1]. Martin last successful test was with kernel 2.11.
>>>>>>>
>>>>>>> How to reproduce this:
>>>>>>> mainline kernel 4.12.0-rc3 (using ArchLinuxARM arm8v generic rootfs)
>>>>>>> Khadas VIM Pro nvram file [2]
>>>>>> I have been in contact with others using AP6255-based SDIO WiFi with this driver. After the recent version of the driver
>>>>>> was released I didn't hear back from them so I assume WiFi is working fine for them.
>>>>> I've also seen the errors with older version of your patches where
>>>>> SDIO wifi would fail to initialize (or load the firmware).
>>>>> I just tried it myself on a Khadas VIM Pro (= the one with AP6255):
>>>>> - wifi firmware loads fine
>>>>> - as soon as I put traffic on the wireless interface (iperf3 -c <ip of
>>>>> server>) I hit the same WARN_ON as crow (it happens almost instantly
>>>>> after I start iperf3)
>>>>>
>>>>>> The firmware image you use seems to be for Android. Not sure which firmware the others are using.
>>>>>> This might be a potential reason for the issue.
>>>>>>
>>>>>>> Enable WiFI (connected to 5GHZ AP)
>>>>>>> just do an SSH to Khadas VIM Pro device and you will get these call trace [3] , eventually you get login prompt but it is very slow login.
>>>>>> The log shows that first HW reports a failed write transfer (status 0x0100) and then an unexpected interrupt occurs (host->cmd being NULL).
>>>>>> Apart from the firmware the failed transfer might also be caused by e.g. a too high frequency.
>>>>> # cat /sys/kernel/debug/mmc2/ios
>>>>> clock:          50000000 Hz
>>>>> actual clock:   50000000 Hz
>>>>> vdd:            21 (3.3 ~ 3.4 V)
>>>>> bus mode:       2 (push-pull)
>>>>> chip select:    0 (don't care)
>>>>> power mode:     2 (on)
>>>>> bus width:      2 (4 bits)
>>>>> timing spec:    2 (sd high-speed)
>>>>> signal voltage: 0 (3.30 V)
>>>>> driver type:    0 (driver type B)
>>>>>
>>>>> The .dts from Khadas (based on the Amlogic kernel) uses f_max 200000000
>>>>>
>>>>>> Would be interesting to hear what's the latest kernel version (or even better: a bisect) working for you.
>>>>> I don't really remember what the last actual working version was, but
>>>>> I can try to bisect it tomorrow
>>>>
>>>> This would be great.
>>> 79ed05e329c34c68ab2a87f729e666fcf208fdad is the first bad commit
>>> commit 79ed05e329c34c68ab2a87f729e666fcf208fdad
>>> Author: Heiner Kallweit <hkallweit1 at gmail.com>
>>> Date:   Fri Apr 7 21:22:44 2017 +0200
>>>
>>>     mmc: meson-gx: add support for descriptor chain mode
>>>
>>>     So far a bounce buffer is used to serialize the scatterlist(s).
>>>     This overhead can be avoided by switching to descriptor chain mode.
>>>     As result the performance is drastically improved. On a Odroid-C2 with
>>>     a 128 GB eMMC module raw reads reach 140 MB/s.
>>>
>>>     Prerequisite for descriptor chain mode is that all scatterlist buffers
>>>     are 8 byte aligned for 64-bit DMA. That's not always the case, at least
>>>     the brcmfmac SDIO WiFi driver is known to cause problems.
>>>
>>>     Therefore, for each request, check upfront whether all scatterlist
>>>     buffers are 8 byte aligned and fall back to bounce buffer mode if
>>>     that's not the case.
>>>
>>>     Signed-off-by: Heiner Kallweit <hkallweit1 at gmail.com>
>>>     Signed-off-by: Ulf Hansson <ulf.hansson at linaro.org>
>>>
>>> :040000 040000 c9771e14cfc70fa36ab67b6f4e246d714ff0e40d
>>> a5b816917ce3c7e0d7f2975ae116a94ef5b6db99 M      drivers
>>>
>>> $ git bisect log
>>> git bisect start
>>> # bad: [3c2993b8c6143d8a5793746a54eba8f86f95240f] Linux 4.12-rc4
>>> git bisect bad 3c2993b8c6143d8a5793746a54eba8f86f95240f
>>> # good: [94d765bbfb1910ef8db5699d19907a0756a99d03] mmc: meson-gx:
>>> simplify bounce buffer setting in meson_mmc_start_cmd
>>> git bisect good 94d765bbfb1910ef8db5699d19907a0756a99d03
>>> # bad: [16a12fa9aed176444fc795b09e796be41902bb08] Merge branch
>>> 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
>>> git bisect bad 16a12fa9aed176444fc795b09e796be41902bb08
>>> # good: [8d65b08debc7e62b2c6032d7fe7389d895b92cbc] Merge
>>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>> git bisect good 8d65b08debc7e62b2c6032d7fe7389d895b92cbc
>>> # good: [8d65b08debc7e62b2c6032d7fe7389d895b92cbc] Merge
>>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>> git bisect good 8d65b08debc7e62b2c6032d7fe7389d895b92cbc
>>> # good: [b769fefb68cd70385d68220ae341e5a10723fbc0] Backmerge tag
>>> 'v4.11-rc6' into drm-next
>>> git bisect good b769fefb68cd70385d68220ae341e5a10723fbc0
>>> # bad: [2f34c1231bfc9f2550f934acb268ac7315fb3837] Merge tag
>>> 'drm-for-v4.12' of git://people.freedesktop.org/~airlied/linux
>>> git bisect bad 2f34c1231bfc9f2550f934acb268ac7315fb3837
>>> # bad: [0302e28dee643932ee7b3c112ebccdbb9f8ec32c] Merge branch 'next'
>>> of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
>>> git bisect bad 0302e28dee643932ee7b3c112ebccdbb9f8ec32c
>>> # bad: [68fed41e0ff6c0332520a0d70ac05be2a7d9130e] Merge tag
>>> 'pinctrl-v4.12-1' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
>>> git bisect bad 68fed41e0ff6c0332520a0d70ac05be2a7d9130e
>>> # good: [ec5ab8933772c87f24ad62a4a602fe8949f423c2] mmc: host:
>>> omap_hsmmc: checking for NULL instead of IS_ERR()
>>> git bisect good ec5ab8933772c87f24ad62a4a602fe8949f423c2
>>> # skip: [8d4264d86c1a31d7e0efd3e4c3e08bc91dbdb48c] Merge tag
>>> 'samsung-pinctrl-4.12' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/samsung into
>>> devel
>>> git bisect skip 8d4264d86c1a31d7e0efd3e4c3e08bc91dbdb48c
>>> # bad: [e145ac451eb68b51e0ede4c131bd5a539fb675b6] mmc: sdhci-of-esdhc:
>>> poll ESDHC_CLOCK_STABLE bit with udelay
>>> git bisect bad e145ac451eb68b51e0ede4c131bd5a539fb675b6
>>> # bad: [83b600b8de0e83661a8f40091a7966fac052a21b] mmc: sdhci: Control
>>> the delay between tuning commands
>>> git bisect bad 83b600b8de0e83661a8f40091a7966fac052a21b
>>> # bad: [c34d1579de4665d9613691bc37aebb7ca86d535d] mmc: cavium: Remove
>>> redundant pointer check
>>> git bisect bad c34d1579de4665d9613691bc37aebb7ca86d535d
>>> # good: [c08bcb6c90068d8eacffdb8b5f77916de5481a72] mmc: meson-gx:
>>> introduce struct meson_tuning_params
>>> git bisect good c08bcb6c90068d8eacffdb8b5f77916de5481a72
>>> # good: [0b6ed71c3e95abf6cd1dbfc53919defdce3449f5] mmc: meson-gx: add
>>> basic tuning for rx clock phase
>>> git bisect good 0b6ed71c3e95abf6cd1dbfc53919defdce3449f5
>>> # bad: [79ed05e329c34c68ab2a87f729e666fcf208fdad] mmc: meson-gx: add
>>> support for descriptor chain mode
>>> git bisect bad 79ed05e329c34c68ab2a87f729e666fcf208fdad
>>> # first bad commit: [79ed05e329c34c68ab2a87f729e666fcf208fdad] mmc:
>>> meson-gx: add support for descriptor chain mode
>>>
>>> I manually applied [0] in all steps, otherwise there was an infinite
>>> stream of "brcmfmac: brcmf_sdio_rxglom: sublen 84 not multiple of 8"
>>> messages
>>>
>>> testing steps:
>>> - ifconfig eth0 down
>>> - <start wifi>
>>> - iperf3 -c <server ip>
>>> - iperf3 -c <server ip> -R
>>>
>>>>>
>>>>>> I'm using a Odroid-C2 where SDIO isn't available, so I'm not able to test.
>>>>> are you from Germany? I can give you a Khadas VIM so you can test this
>>>>> yourself for a few weeks if you want
>>>>>
>>>>> what just caught my eye though is the max_req_size calculation: [0]
>>>>> Amlogic's kernel uses max_req_size = <0x20000>; /**128KB*/ for the SD
>>>>> and SDIO controllers, but max_req_size = <0x20000>; /**256KB*/ for the
>>>>> eMMC controller
>>>>> it caught my eye because (some time ago) I found out that there are
>>>>> two different hardware buffer sizes for the UART FIFO buffer (see page
>>>>> 309 of the public S905 datasheet, UART0 has a 128 byte buffer, while
>>>>> all other UARTs only have a 64 byte buffer)
>>>>>
>>>> In the logs provided by crow the issue happened with small writes
>>>> (just one or two blocks), so most likely the problem here has a
>>>> different nature.
>>>>
>>>> In general I'm wondering whether the mainline Meson GX mmc driver ever
>>>> fully supported SDIO. For example SDIO interrupts aren't supported.
>>> actually I think SDIO interrupts are the only "SDIO specific" feature
>>> that is missing
>>> but it's not as important as one thinks, since we could even simulate
>>> this with yours (and Jerome's and Carlo's) GPIO interrupts instead
>>>
>>> I don't see anything wrong with SDIO wifi support "before" the broken commit:
>>> Accepted connection from <Khadas VIM IP>, port 34078
>>> [  5] local <server IP> port 5201 connected to <Khadas VIM IP> port 34080
>>> [ ID] Interval           Transfer     Bandwidth
>>> [  5]   0.00-1.00   sec  3.34 MBytes  28.0 Mbits/sec
>>> [  5]   1.00-2.00   sec  5.20 MBytes  43.6 Mbits/sec
>>> [  5]   2.00-3.00   sec  5.13 MBytes  43.1 Mbits/sec
>>> [  5]   3.00-4.00   sec  7.32 MBytes  61.4 Mbits/sec
>>> [  5]   4.00-5.00   sec  3.99 MBytes  33.5 Mbits/sec
>>> [  5]   5.00-6.00   sec  7.27 MBytes  60.9 Mbits/sec
>>> [  5]   6.00-7.00   sec  3.62 MBytes  30.3 Mbits/sec
>>> [  5]   7.00-8.00   sec  6.00 MBytes  50.3 Mbits/sec
>>> [  5]   8.00-9.00   sec  4.48 MBytes  37.6 Mbits/sec
>>> [  5]   9.00-10.00  sec  4.41 MBytes  37.0 Mbits/sec
>>> [  5]  10.00-10.05  sec   390 KBytes  69.8 Mbits/sec
>>> - - - - - - - - - - - - - - - - - - - - - - - - -
>>> [ ID] Interval           Transfer     Bandwidth
>>> [  5]   0.00-10.05  sec  0.00 Bytes  0.00 bits/sec                  sender
>>> [  5]   0.00-10.05  sec  51.1 MBytes  42.7 Mbits/sec                  receiver
>>> -----------------------------------------------------------
>>> Server listening on 5201
>>> -----------------------------------------------------------
>>> Accepted connection from <Khadas VIM IP>, port 34082
>>> [  5] local <server IP> port 5201 connected to <Khadas VIM IP> port 34084
>>> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
>>> [  5]   0.00-1.00   sec  6.03 MBytes  50.5 Mbits/sec   22    180 KBytes
>>> [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   47    189 KBytes
>>> [  5]   2.00-3.00   sec  5.59 MBytes  46.9 Mbits/sec   26    204 KBytes
>>> [  5]   3.00-4.00   sec  4.54 MBytes  38.1 Mbits/sec    0    219 KBytes
>>> [  5]   4.00-5.00   sec  5.84 MBytes  49.0 Mbits/sec   22    229 KBytes
>>> [  5]   5.00-6.00   sec  5.03 MBytes  42.2 Mbits/sec   18    239 KBytes
>>> [  5]   6.00-7.00   sec  5.97 MBytes  50.0 Mbits/sec    1    256 KBytes
>>> [  5]   7.00-8.00   sec  6.15 MBytes  51.6 Mbits/sec    0    274 KBytes
>>> [  5]   8.00-9.00   sec  4.54 MBytes  38.1 Mbits/sec   12    269 KBytes
>>> [  5]   9.00-10.00  sec  6.77 MBytes  56.8 Mbits/sec    4    296 KBytes
>>> [  5]  10.00-10.07  sec   636 KBytes  74.2 Mbits/sec    0    297 KBytes
>>> - - - - - - - - - - - - - - - - - - - - - - - - -
>>> [ ID] Interval           Transfer     Bandwidth       Retr
>>> [  5]   0.00-10.07  sec  54.4 MBytes  45.3 Mbits/sec  152             sender
>>> [  5]   0.00-10.07  sec  0.00 Bytes  0.00 bits/sec                  receiver
>>>
>> Thanks a lot for the analysis!
>>
>> Somewhat strange that descriptor chain mode works fine with SD/eMMC but causes
>> issues with SDIO. Sounds like the controller might have a problem.
>> The vendor SDIO driver includes at least one quirk:
>> https://github.com/hardkernel/linux/blob/odroidc2-3.14.y/drivers/amlogic/mmc/aml_sdio.c#L157
>> But this doesn't seem to be the problem here.
>>
>> The Meson GX mmc driver allows already to fall back to bounce buffer mode,
>> so far this is used in case of a unaligned sg dma buffer. As a possible
>> workaround we could try the same fallback if command opcode is SD_IO_RW_EXTENDED.
>> (as crow's log shows issues with this comamnd only)
>>
>> Could you please check whether the following fixes SDIO?
>>
>> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
>> index 1842ed34..c51c37ea 100644
>> --- a/drivers/mmc/host/meson-gx-mmc.c
>> +++ b/drivers/mmc/host/meson-gx-mmc.c
>> @@ -210,6 +210,9 @@ static void meson_mmc_get_transfer_mode(struct mmc_host *mmc,
>>         int i;
>>         bool use_desc_chain_mode = true;
>>
>> +       if (mrq->cmd->opcode == SD_IO_RW_EXTENDED)
>> +               return;
>> +
>>         for_each_sg(data->sg, sg, data->sg_len, i)
>>                 /* check for 8 byte alignment */
>>                 if (sg->offset & 7) {
>>
>> Rgds, Heiner
>>
> 
> I can confirm that, it does work with fix your provided. There is no
> Kernel panic anymore here, and I was able to SSH to device, but there
> is high lag there. Probably due the "sublen 1532 not multiple of 8"
> which Martin already mentioned, so I am compiling Kernel with that
> patch to check if that will fix the lag.
> 
Thanks a lot for the report. Great that it fixed SDIO, although I don't
really have a good explanation why descriptor chain mode is failing
with SDIO but not with SD/eMMC.

Warning "sublen xxx not multiple of 8" is harmless and doesn't cause
any re-sending of packets or similar.
However printing the warning with high frequency mighte cause a slowdown.
See also this log line: net_ratelimit: 2037 callbacks suppressed

Curious to hear whether the patch removing this error message improves
WiFi responsiveness.

Rgds, Heiner

> In the link [0] you can find whole dmesg, at the end there is info
> that Ethernet cable was unplugged and then WiFi message about that
> sublen. No information that WiFi was enabled/start but for sure it was
> working (saw info about dhcpcd on wlan0 in journalctl ).
> 
> Regards,
> 
>>>
>>>> Kevin:
>>>> Before submitting the initial version of the driver, did you test also
>>>> SDIO or just SD/eMMC?
>>>>
>>>>>
>>>>> Regards,
>>>>> Martin
>>>>>
>>>>> [0] https://github.com/torvalds/linux/blob/master/drivers/mmc/host/meson-gx-mmc.c#L971
>>>>>
>>>>
>>>
>>> [0] https://patchwork.kernel.org/patch/9725563/
>>>
>>
> 
> [0] https://defuse.ca/b/0VnfDe3S
> 




More information about the linux-amlogic mailing list