PXA168 + 88W8686 SDIO interrupt troubles
Doug Brown
doug at schmorgal.com
Thu Jun 9 00:50:47 PDT 2022
Hi there,
I don't know how active this list is anymore, especially given how old
the devices used by this driver are, but I figured I'd give it a shot!
I'm working with an old PXA168-based device that has a Marvell 8686 WiFi
chip connected over 4-bit SDIO. I've been slowly bringing everything up
to date so that it runs on newer kernels. Right now I'm working with
5.15. This is all for fun...I keep telling myself anyway!
I'm running into a really weird problem, and I'm having some trouble
understanding where it's coming from: the SDIO card IRQ sometimes
doesn't fire. The result is a command timeout, after which the libertas
driver gets completely out of sync with the card (sequence numbers off
by one), and eventually the card just gets removed:
[ 43.842830] libertas_sdio mmc1:0001:1 wlan0: command 0x0024 timed out
[ 43.850436] libertas_sdio mmc1:0001:1 wlan0: Timeout submitting command 0x0024
[ 43.861429] libertas_sdio mmc1:0001:1 wlan0: PREP_CMD: command 0x0024 failed: -110
[ 43.870470] libertas_sdio: Resetting card...
[ 46.882885] libertas_sdio mmc1:0001:1 wlan0: command 0x0021 timed out
[ 46.893907] libertas_sdio mmc1:0001:1 wlan0: Timeout submitting command 0x0021
[ 46.907086] libertas_sdio mmc1:0001:1 wlan0: Received CMD_RESP with invalid sequence 43 (expected 44)
[ 49.922837] libertas_sdio mmc1:0001:1 wlan0: command 0x0024 timed out
[ 49.929487] libertas_sdio mmc1:0001:1 wlan0: Timeout submitting command 0x0024
[ 49.938191] libertas_sdio mmc1:0001:1 wlan0: PREP_CMD: command 0x0024 failed: -110
[ 49.952933] libertas_sdio mmc1:0001:1 wlan0: Received CMD_RESP with invalid sequence 44 (expected 45)
[ 52.962828] libertas_sdio mmc1:0001:1 wlan0: command 0x0010 timed out
[ 52.969568] libertas_sdio mmc1:0001:1 wlan0: Timeout submitting command 0x0010
[ 53.153993] mmc1: card 0001 removed
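The "invalid sequence 43 (expected 44)" lines are what you would expect if a response is only read after its command has already been timed out and the next command has gone out. Below is a rough userspace toy model of that mechanism. It is not libertas code: every name in it (toy_card, toy_host, host_send, host_read, run) is made up for illustration, and it assumes the card simply queues responses in order.

```c
#include <assert.h>
#include <stdio.h>

#define FIFO_LEN 8

/* Toy card: responses wait here, in order, until the host reads them. */
struct toy_card { int fifo[FIFO_LEN]; int head, tail; };

/* Toy host: tracks the sequence number of the command in flight. */
struct toy_host { int next_seq; int expected_seq; int mismatches; };

/* Send one command; the card immediately queues a matching response. */
static void host_send(struct toy_host *h, struct toy_card *c)
{
    assert(c->tail - c->head < FIFO_LEN);
    h->expected_seq = h->next_seq++;
    c->fifo[c->tail++ % FIFO_LEN] = h->expected_seq;
}

/* Read the oldest queued response, as an IRQ handler would.
 * Returns 1 on a match, 0 on a sequence mismatch, -1 if nothing is queued. */
static int host_read(struct toy_host *h, struct toy_card *c)
{
    if (c->head == c->tail)
        return -1;
    int seq = c->fifo[c->head++ % FIFO_LEN];
    if (seq != h->expected_seq) {
        printf("invalid sequence %d (expected %d)\n", seq, h->expected_seq);
        h->mismatches++;
        return 0;
    }
    return 1;
}

/* Issue n commands starting at first_seq. The interrupt for command index
 * lost_at (0-based) never arrives, so the host times out without reading.
 * Pass lost_at = -1 for no lost interrupt. Returns the mismatch count. */
static int run(int first_seq, int n, int lost_at)
{
    struct toy_card c = {0};
    struct toy_host h = { .next_seq = first_seq };

    for (int i = 0; i < n; i++) {
        host_send(&h, &c);
        if (i == lost_at)
            continue;           /* lost IRQ: response stays queued */
        host_read(&h, &c);
    }
    return h.mismatches;
}
```

In this model run(43, 4, -1) reports no mismatches, while run(43, 4, 0) reports three: a single lost interrupt leaves every later response one behind, starting with "invalid sequence 43 (expected 44)".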
By adding a bunch of debugging printks and tracing with the
libertas_debug kernel param, it has become apparent that the host
controller is sometimes losing SDIO card interrupts, and the libertas
driver doesn't recover gracefully when it happens. The following
sequence of events happens right after association:
- sdhci_irq detects an SDHCI_INT_CARD_INT interrupt
- Card interrupts are disabled in the SDHCI host controller
- if_sdio_interrupt in the libertas driver runs, clears the card irq by
writing to IF_SDIO_H_INT_STATUS, and then handles the irq
- Card interrupts are re-enabled in the SDHCI host controller
- Even though a new interrupt is definitely ready to go according to
the IF_SDIO_H_INT_STATUS register, the SDHCI IRQ handler never
detects a new SDHCI_INT_CARD_INT interrupt
- The timeout occurs, and then suddenly the libertas driver realizes
there was a response waiting, but now it's out of sync because it
thinks the last command failed and timed out even though it didn't.
I'm guessing this missed interrupt can't be caused by the libertas
driver, can it? It feels like some kind of a hardware bug with the
PXA168's SDHC controller, like it's treating card interrupts as edge
interrupts instead of level interrupts, but I don't know how to prove
it. There's no public errata available for the PXA168 because it's
Marvell. I've messed around with the interrupt logic in the Linux SDHCI
code to try to work around this, but I can't figure out a tweak to make
it work correctly. I'm struggling to understand what is causing this.
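To make the edge-versus-level suspicion concrete, here is a small userspace toy model of the sequence above. It is only a sketch: the struct and function names are invented, it is not how the SDHCI code or the PXA168 hardware is actually implemented, and it assumes the card raises its second interrupt while card interrupts are still masked in the host controller.

```c
#include <stdbool.h>

enum sense { SENSE_LEVEL, SENSE_EDGE };

/* Toy host controller state for the card interrupt only. */
struct toy_sdhc {
    enum sense sense;
    bool enabled;   /* card interrupt enabled in the controller */
    bool line;      /* card's interrupt line, true = asserted */
    bool latched;   /* edge mode only: a rising edge was captured */
};

/* Drive the card's interrupt line. In edge mode a rising edge is only
 * captured while the interrupt is enabled (this is the hypothesis). */
static void set_line(struct toy_sdhc *h, bool level)
{
    if (h->sense == SENSE_EDGE && h->enabled && !h->line && level)
        h->latched = true;
    h->line = level;
}

static void set_enabled(struct toy_sdhc *h, bool on) { h->enabled = on; }

/* Acknowledge a captured edge (no effect on level-sensitive behavior). */
static void ack(struct toy_sdhc *h) { h->latched = false; }

/* Would the controller report a card interrupt right now? */
static bool irq_pending(const struct toy_sdhc *h)
{
    if (!h->enabled)
        return false;
    return h->sense == SENSE_LEVEL ? h->line : h->latched;
}

/* Replay the sequence from the list above and report whether the
 * second card interrupt is ever seen by the host. */
static bool second_irq_seen(enum sense s)
{
    struct toy_sdhc h = { .sense = s, .enabled = true };

    set_line(&h, true);          /* card raises the first interrupt */
    if (!irq_pending(&h))
        return false;
    set_enabled(&h, false);      /* host disables card interrupts */
    ack(&h);
    set_line(&h, false);         /* handler clears the card's status */
    set_line(&h, true);          /* card raises a second interrupt */
    set_enabled(&h, true);       /* host re-enables card interrupts */
    return irq_pending(&h);
}
```

With level-sensitive behavior the second interrupt is seen as soon as card interrupts are re-enabled, because the line is still asserted. With the hypothesized edge behavior the rising edge happened while masked, so nothing is ever reported, which matches the symptom.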
Has anyone else run into this or something like it? I feel like I should
be able to find something in Marvell's old 2.6.28 kernel for working
around this, but I'm not finding anything. Maybe the problem was always
there, but the old Marvell driver automatically recovered gracefully
from the timeout after a missed IRQ?
Going back through the mailing list archives, it seems like there was often a
struggle to even get this driver to use interrupt mode because a lot of
host controllers didn't support interrupts back then. Mine supposedly
does...but it's not working correctly. If I hack the SDHC controller
code to disable MMC_CAP_SDIO_IRQ and MMC_CAP2_SDIO_IRQ_NOTHREAD, the
libertas driver works perfectly with polling on this setup, so I know
the driver works fine when it's not losing IRQs.
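The hack itself isn't shown here, but the idea is just to clear those two capability bits before the host is registered, so that the MMC core falls back to polling for SDIO interrupts. A userspace stand-in for illustration follows; the struct and the TOY_ bit values are placeholders, and in a real kernel the flags are MMC_CAP_SDIO_IRQ and MMC_CAP2_SDIO_IRQ_NOTHREAD from include/linux/mmc/host.h, applied to the caps and caps2 fields of struct mmc_host.

```c
#include <stdint.h>

/* Placeholder bit values for illustration only; use the real
 * MMC_CAP_SDIO_IRQ / MMC_CAP2_SDIO_IRQ_NOTHREAD definitions in a kernel. */
#define TOY_CAP_SDIO_IRQ            (1u << 3)
#define TOY_CAP2_SDIO_IRQ_NOTHREAD  (1u << 17)

/* Stand-in for the two capability words of struct mmc_host. */
struct toy_mmc_host {
    uint32_t caps;
    uint32_t caps2;
};

/* Clear both SDIO IRQ capabilities and leave every other bit alone. */
static void toy_force_sdio_polling(struct toy_mmc_host *mmc)
{
    mmc->caps  &= ~TOY_CAP_SDIO_IRQ;
    mmc->caps2 &= ~TOY_CAP2_SDIO_IRQ_NOTHREAD;
}

/* Nonzero if the host still advertises SDIO interrupt support. */
static int toy_uses_sdio_irq(const struct toy_mmc_host *mmc)
{
    return (mmc->caps & TOY_CAP_SDIO_IRQ) != 0;
}
```

Polling costs some latency and CPU time compared with real interrupts, so this is more of a diagnostic than a fix.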
Thanks in advance for any ideas/advice, if anyone's still out there!