[BUG] cqe unable to handle buggy cards
Adrian Hunter
adrian.hunter at intel.com
Mon Nov 16 03:21:32 EST 2020
On 6/11/20 10:54 pm, Peter Geis wrote:
> On Wed, Nov 4, 2020 at 7:39 PM Peter Geis <pgwipeout at gmail.com> wrote:
>>
>> On Mon, Nov 2, 2020 at 11:32 AM Adrian Hunter <adrian.hunter at intel.com> wrote:
>>>
>>> On 2/11/20 4:54 pm, Ulf Hansson wrote:
>>>> + cqhci maintainers
>>>>
>>>> On Sat, 31 Oct 2020 at 13:54, Peter Geis <pgwipeout at gmail.com> wrote:
>>>>>
>>>>> Good Morning,
>>>>>
>>>>> We are seeing an issue on the rk3399 with certain Foresee emmc modules
>>>>> where the module reports it supports command queuing but fails in actual
>>>>> implementation.
>>>>>
>>>>> Unfortunately there doesn't seem to be any method for the mmc core code
>>>>> to detect this situation and disable command queue automatically.
>>>>> There also appears to be no way to disable it at runtime.
>>>
>>> Since v5.5, if you know how to use SDHCI debug quirks there is
>>> SDHCI_QUIRK_BROKEN_CQE
>>> e.g. kernel command line option sdhci.debug_quirks=0x0x2020000
>>
>> Thank you, we will test this.
>
> This does resolve the issue of disabling cqe entirely for debugging, thanks!
>
>>
>>>
>>>>>
>>>>> Certain modified kernels have added a patch to enable runtime disable of
>>>>> command queue entirely, but this will affect mmc core as a whole and not
>>>>> just the buggy card.
>>>>>
>>>>> Does anyone have any insight into this issue?
>>>>> Thank you for your time.
>>>>
>>>> Unfortunate, not me personally. I assume the issue is either be card
>>>> specific or host specific. Before looking at a disable option, we need
>>>> to know more about what goes wrong, I think.
>>>>
>>>> Kind regards
>>>> Uffe
>>>>
>>>>>
>>>>> Very Respectfully,
>>>>> Peter Geis
>>>>>
>>>>> [ 64.472882] mmc2: cqhci: timeout for tag 2
>>>>> [ 64.473349] mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
>>>>> [ 64.474057] mmc2: cqhci: Caps: 0x00000000 | Version: 0x00000510
>>>>> [ 64.474763] mmc2: cqhci: Config: 0x00000000 | Control: 0x00000000
>>>>> [ 64.475468] mmc2: cqhci: Int stat: 0x00000000 | Int enab: 0x00000000
>>>>> [ 64.476172] mmc2: cqhci: Int sig: 0x00000000 | Int Coal: 0x00000000
>>>>> [ 64.476875] mmc2: cqhci: TDL base: 0x00000000 | TDL up32: 0x00000000
>>>
>>> TDL base cannot be zero, so the register values have been lost.
>>> Could be a reset issue like this one but for sdhci-of-arasan.c :
>>>
>>> https://lore.kernel.org/linux-mmc/20200819121848.16967-1-adrian.hunter@intel.com/
>>
>> Excellent, we will see if a similar implementation makes a difference
>> here for us as well.
>
> I wrote a patch to implement this in the arasan driver.
> https://paste.ee/p/cl5SX
> Nuumiofi tested it with the buggy card.
>
> The good news, it solves the register clearing issue.
> The bad news, the card still is broken, here is the kernel log:
> https://paste.ubuntu.com/p/sF2yMwxpcV/
You will need to enable dynamic debug to get more messages e.g.
kernel config: CONFIG_DYNAMIC_DEBUG=y
kernel command line:
dyndbg="file drivers/mmc/core/* +p;file drivers/mmc/host/* +p"
If it can't be made to work, you can set SDHCI_QUIRK_BROKEN_CQE in the
driver as needed.
>
>>
>>>
>>>
>>>>> [ 64.477578] mmc2: cqhci: Doorbell: 0x00000000 | TCN: 0x00000000
>>>>> [ 64.478281] mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00000000
>>>>> [ 64.478984] mmc2: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
>>>>> [ 64.479687] mmc2: cqhci: SSC2: 0x00000000 | DCMD rsp: 0x00000000
>>>>> [ 64.489785] mmc2: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
>>>>> [ 64.499774] mmc2: cqhci: Resp idx: 0x00000000 | Resp arg: 0x00000000
>>>>> [ 64.509687] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
>>>>> [ 64.519597] mmc2: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
>>>>> [ 64.529521] mmc2: sdhci: Blk size: 0x00007200 | Blk cnt: 0x00000000
>>>>> [ 64.539440] mmc2: sdhci: Argument: 0x00010000 | Trn mode: 0x00000010
>>>>> [ 64.549352] mmc2: sdhci: Present: 0x1fff0000 | Host ctl: 0x00000034
>>>>> [ 64.559277] mmc2: sdhci: Power: 0x0000000b | Blk gap: 0x00000080
>>>>> [ 64.569214] mmc2: sdhci: Wake-up: 0x00000000 | Clock: 0x00000007
>>>>> [ 64.579061] mmc2: sdhci: Timeout: 0x0000000e | Int stat: 0x00000000
>>>>> [ 64.588842] mmc2: sdhci: Int enab: 0x02ff4000 | Sig enab: 0x02ff4000
>>>>> [ 64.598671] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
>>>>> [ 64.608446] mmc2: sdhci: Caps: 0x44edc880 | Caps_1: 0x800020f7
>>>>> [ 64.618161] mmc2: sdhci: Cmd: 0x00000d1a | Max curr: 0x00000000
>>>>> [ 64.627801] mmc2: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0x642017d7
>>>>> [ 64.637376] mmc2: sdhci: Resp[2]: 0x4e436172 | Resp[3]: 0x00880103
>>>>> [ 64.646855] mmc2: sdhci: Host ctl2: 0x00000083
>>>>> [ 64.656080] mmc2: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xf0628208
>>>>> [ 64.665445] mmc2: sdhci: ============================================
>>>>> [ 64.674998] mmc2: running CQE recovery
>>>>>
>>>>> [ 125.912941] mmc2: cqhci: timeout for tag 3
>>>>> [ 125.921978] mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
>>>>> [ 125.931200] mmc2: cqhci: Caps: 0x00000000 | Version: 0x00000510
>>>>> [ 125.940389] mmc2: cqhci: Config: 0x00000001 | Control: 0x00000000
>>>>> [ 125.949499] mmc2: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
>>>>> [ 125.958527] mmc2: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
>>>>> [ 125.967486] mmc2: cqhci: TDL base: 0x00000000 | TDL up32: 0x00000000
>>>>> [ 125.976260] mmc2: cqhci: Doorbell: 0x00000000 | TCN: 0x00000000
>>>>> [ 125.985065] mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00000000
>>>>> [ 125.993698] mmc2: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
>>>>> [ 126.002244] mmc2: cqhci: SSC2: 0x00000000 | DCMD rsp: 0x00000000
>>>>> [ 126.010716] mmc2: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
>>>>> [ 126.019159] mmc2: cqhci: Resp idx: 0x00000000 | Resp arg: 0x00000000
>>>>> [ 126.027525] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
>>>>> [ 126.035955] mmc2: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
>>>>> [ 126.044258] mmc2: sdhci: Blk size: 0x00007200 | Blk cnt: 0x00000000
>>>>> [ 126.052396] mmc2: sdhci: Argument: 0x00000001 | Trn mode: 0x00000010
>>>>> [ 126.060370] mmc2: sdhci: Present: 0x1fff0000 | Host ctl: 0x00000034
>>>>> [ 126.068241] mmc2: sdhci: Power: 0x0000000b | Blk gap: 0x00000080
>>>>> [ 126.075978] mmc2: sdhci: Wake-up: 0x00000000 | Clock: 0x00000007
>>>>> [ 126.083552] mmc2: sdhci: Timeout: 0x0000000e | Int stat: 0x00000000
>>>>> [ 126.090937] mmc2: sdhci: Int enab: 0x02ff4000 | Sig enab: 0x02ff4000
>>>>> [ 126.098219] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
>>>>> [ 126.105403] mmc2: sdhci: Caps: 0x44edc880 | Caps_1: 0x800020f7
>>>>> [ 126.112649] mmc2: sdhci: Cmd: 0x00003013 | Max curr: 0x00000000
>>>>> [ 126.119700] mmc2: sdhci: Resp[0]: 0x00400800 | Resp[1]: 0x642017d7
>>>>> [ 126.126594] mmc2: sdhci: Resp[2]: 0x4e436172 | Resp[3]: 0x00880103
>>>>> [ 126.133334] mmc2: sdhci: Host ctl2: 0x00000083
>>>>> [ 126.139652] mmc2: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xf0628208
>>>>> [ 126.146008] mmc2: sdhci: ============================================
>>>>> [ 126.152361] mmc2: running CQE recovery
>>>>>
>>>>>
>>>
More information about the Linux-rockchip
mailing list