dwc2: irq 66: nobody cared triggered on resume

Stefan Wahren wahrenst at gmx.net
Mon Jun 24 10:02:02 PDT 2024


Hi,

On 23.06.24 at 15:27, Stefan Wahren wrote:
> Hello Lukas,
>
> On 22.06.24 at 20:47, Lukas Wunner wrote:
>> On Sat, Jun 22, 2024 at 02:23:33PM +0200, Stefan Wahren wrote:
>>> I am currently experimenting with suspend-to-idle on the Raspberry Pi 3 A+.
>>> Suspend & resume work as expected as long as no USB device is connected to
>>> the board. If I connect a USB hub to the Pi, the resume phase is
>>> significantly delayed and the kernel disables IRQ 66, which belongs
>>> to DWC2.
>> [...]
>>> [ 1131.109996] PM: noirq resume of devices complete after 1.273 msecs
>>> [ 1131.111208] PM: early resume of devices complete after 1.051 msecs
>>> [ 1131.230277] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43455-sdio for chip BCM4345/6
>>> [ 1131.458687] irq 66: nobody cared (try booting with the "irqpoll" option)
>>> [ 1131.458714] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.10.0-rc3-g7fd4227d1bd5-dirty #49
>>> [ 1131.458734] Hardware name: BCM2835
>>> [ 1131.458744] Call trace:
>> [...]
>>> [ 1131.458877] note_interrupt from handle_irq_event+0x88/0x8c
>>> [ 1131.458900] handle_irq_event from handle_level_irq+0xb4/0x1ac
>>> [ 1131.458923] handle_level_irq from generic_handle_domain_irq+0x24/0x34
>>> [ 1131.458957] generic_handle_domain_irq from bcm2836_chained_handle_irq+0x24/0x28
>>> [ 1131.458992] bcm2836_chained_handle_irq from generic_handle_domain_irq+0x24/0x34
>>> [ 1131.459024] generic_handle_domain_irq from generic_handle_arch_irq+0x34/0x44
>>> [ 1131.459056] generic_handle_arch_irq from __irq_svc+0x88/0xb0
>>> [ 1131.459079] Exception stack(0xc1b01f20 to 0xc1b01f68)
>>> [ 1131.459142] __irq_svc from default_idle_call+0x1c/0xb0
>>> [ 1131.459167] default_idle_call from do_idle+0x21c/0x284
>>> [ 1131.459202] do_idle from cpu_startup_entry+0x28/0x2c
>>> [ 1131.459239] cpu_startup_entry from kernel_init+0x0/0x12c
>>> [ 1131.459271] handlers:
>>> [ 1131.459279] [<f539e0f4>] dwc2_handle_common_intr
>>> [ 1131.459308] [<75cd278b>] usb_hcd_irq
>>> [ 1131.459329] Disabling IRQ #66
>> [...]
>>> Any ideas what is causing this issue?
>> Interrupts are re-enabled after the resume_noirq phase.  Looks like
>> the chip signals an interrupt right afterwards but the two hardirq
>> handlers do not feel responsible.
>>
>> The only option might be to add a few printk() calls in
>> dwc2_handle_common_intr(), usb_hcd_irq() and dwc2_handle_hcd_intr()
>> (called from usb_hcd_irq()) to see why they're all returning IRQ_NONE
>> without clearing the source
>> of the interrupt.  The chip just keeps signaling interrupts because
>> the driver doesn't handle them, hence the IRQ storm which the IRQ core
>> eventually stops by outright disabling the interrupt.
> Thanks for your suggestion. Unfortunately, placing printk() calls in those
> busy interrupt handlers is futile, so I switched to debugfs. This issue
> would be much easier to debug if the interrupt weren't shared. But
> first let me share some output before I start to extend debugfs further:
>
> 1. No hub connected to Rpi 3 A+
>
> root@raspberrypi:/sys/kernel/debug/usb/3f980000.usb# cat state
> DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007ff02
> DIEPMSK=0x00000000, DOEPMASK=0x00000000
> GINTMSK=0xf3000806, GINTSTS=0x04000023
> DAINTMSK=0x00000000, DAINT=0x00000000
> GNPTXSTS=0x00080100, GRXSTSR=3f83bbfe
>
> 2. Hub connected before suspend / irq issue
>
> DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007a202
> DIEPMSK=0x00000000, DOEPMASK=0x00000000
> GINTMSK=0xf300080e, GINTSTS=0x04000023
> DAINTMSK=0x00000000, DAINT=0x00000000
> GNPTXSTS=0x08080100, GRXSTSR=789a460a
>
> 3. Hub connected after suspend / irq issue
>
> DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007ff02
> DIEPMSK=0x00000000, DOEPMASK=0x00000000
> GINTMSK=0xf1000806, GINTSTS=0x0500002b
> DAINTMSK=0x000000ff, DAINT=0x00000000
> GNPTXSTS=0x29080100, GRXSTSR=befdf595
>
> Based on my limited knowledge and observations, the issue seems related
> to GINTMSK/GINTSTS and an outstanding GINTSTS_PRTINT.
I narrowed this down a little further. At least I now know the reason for
the "nobody cared". It's clear that the issue is triggered by
GINTSTS_PRTINT. The DWC2 controller is in host mode, so
dwc2_handle_common_intr() ignores the interrupt and returns IRQ_NONE.
But usb_hcd_irq() cannot handle it either, because HCD_FLAG_HW_ACCESSIBLE
is still clear, so that handler also returns IRQ_NONE :-(
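
For reference, the PRTINT observation can be checked against the register
dumps quoted above with a small stand-alone sketch. The bit positions are an
assumption based on my reading of drivers/usb/dwc2/hw.h (GINTSTS_PRTINT as
bit 24, GINTSTS_CURMODE_HOST as bit 0), and the helper names are made up for
illustration; they are not driver functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, as I read them in drivers/usb/dwc2/hw.h. */
#define GINTSTS_CURMODE_HOST	(1u << 0)
#define GINTSTS_PRTINT		(1u << 24)

/* Hypothetical helper: interrupt sources that are pending and not masked. */
static uint32_t dwc2_pending_enabled(uint32_t gintsts, uint32_t gintmsk)
{
	return gintsts & gintmsk;
}

/* Hypothetical helper: true if an unmasked port interrupt is pending in host mode. */
static bool dwc2_host_prtint_pending(uint32_t gintsts, uint32_t gintmsk)
{
	return (gintsts & GINTSTS_CURMODE_HOST) &&
	       (dwc2_pending_enabled(gintsts, gintmsk) & GINTSTS_PRTINT);
}
```

With the values from dump 3 (GINTSTS=0x0500002b, GINTMSK=0xf1000806) the
helper reports a pending port interrupt; with dump 2 (GINTSTS=0x04000023,
GINTMSK=0xf300080e) it does not.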

Is it expected behavior that the upper layers disable the IRQ instead
of letting the DWC2 controller driver resolve the situation?
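
As far as I understand the generic IRQ code, the disabling comes from its
spurious-interrupt accounting in note_interrupt() (visible in the call trace)
and is not USB specific. The sketch below is a simplified model of that
accounting, not the kernel code: the window of 100000 interrupts and the
limit of 99900 unhandled ones are what I remember from kernel/irq/spurious.c
and should be treated as assumptions, and all model_* names are hypothetical.

```c
#include <stdbool.h>

/* Thresholds as I remember them from kernel/irq/spurious.c (assumption). */
#define MODEL_IRQ_WINDOW	100000u
#define MODEL_UNHANDLED_LIMIT	99900u

struct model_irq_desc {
	unsigned int irq_count;
	unsigned int irqs_unhandled;
	bool disabled;
};

/* Hypothetical, simplified stand-in for note_interrupt(). */
static void model_note_interrupt(struct model_irq_desc *d, bool handled)
{
	if (d->disabled)
		return;
	if (!handled)
		d->irqs_unhandled++;
	d->irq_count++;
	if (d->irq_count < MODEL_IRQ_WINDOW)
		return;
	/* End of a window: disable the line if almost nothing was handled. */
	d->irq_count = 0;
	if (d->irqs_unhandled > MODEL_UNHANDLED_LIMIT)
		d->disabled = true;
	d->irqs_unhandled = 0;
}

/* Feed n interrupts with the same outcome; report whether the line got disabled. */
static bool model_storm_disables_irq(unsigned int n, bool handled)
{
	struct model_irq_desc d = { 0, 0, false };
	unsigned int i;

	for (i = 0; i < n; i++)
		model_note_interrupt(&d, handled);
	return d.disabled;
}
```

In this model a storm in which every handler returns IRQ_NONE gets the line
disabled at the end of the first window, which would match the "Disabling
IRQ #66" message shortly after resume.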

But back to the root cause. I followed the suspend/resume path to find out
why HCD_FLAG_HW_ACCESSIBLE is not set.

Suspend path:

The power-down mode is DWC2_POWER_DOWN_PARAM_NONE, so
HCD_FLAG_HW_ACCESSIBLE is cleared (
https://elixir.bootlin.com/linux/v6.10-rc3/source/drivers/usb/dwc2/hcd.c#L4385
).

Resume path:

During resume the HPRT0_CONNSTS bit is set, so
HCD_FLAG_HW_ACCESSIBLE is not set again (
https://elixir.bootlin.com/linux/v6.10-rc3/source/drivers/usb/dwc2/hcd.c#L4435
).
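
The two paths and the handler behaviour described above can be put together
in a toy model. This is only a sketch under stated assumptions: the struct
and all model_* functions are hypothetical, the real driver code is far more
involved, and the branch that sets the flag again when no port is connected
reflects my reading of the resume code, not something verified on hardware.

```c
#include <stdbool.h>

/* Toy model only: names mirror the kernel, the logic is heavily simplified. */
enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1 };

struct model_hcd {
	bool hw_accessible;	/* stands in for HCD_FLAG_HW_ACCESSIBLE */
	bool port_connected;	/* stands in for HPRT0_CONNSTS */
	bool host_mode;
	bool prtint_pending;	/* stands in for GINTSTS_PRTINT */
};

/* Suspend with DWC2_POWER_DOWN_PARAM_NONE: the flag is cleared. */
static void model_suspend(struct model_hcd *h)
{
	h->hw_accessible = false;
}

/* Resume: with a connected port the flag is left clear (assumed early exit). */
static void model_resume(struct model_hcd *h)
{
	if (h->port_connected)
		return;
	h->hw_accessible = true;
}

/* Common handler: ignores the port interrupt while in host mode. */
static enum irqreturn model_common_intr(const struct model_hcd *h)
{
	if (h->host_mode)
		return IRQ_NONE;
	return h->prtint_pending ? IRQ_HANDLED : IRQ_NONE;
}

/* HCD handler: refuses to touch the hardware while the flag is clear. */
static enum irqreturn model_hcd_irq(struct model_hcd *h)
{
	if (!h->hw_accessible || !h->prtint_pending)
		return IRQ_NONE;
	h->prtint_pending = false;	/* source serviced and cleared */
	return IRQ_HANDLED;
}

/* One suspend/resume cycle followed by a port interrupt in host mode. */
static bool model_irq_claimed(bool port_connected)
{
	struct model_hcd h = {
		.hw_accessible = true,
		.port_connected = port_connected,
		.host_mode = true,
		.prtint_pending = true,
	};

	model_suspend(&h);
	model_resume(&h);

	return model_common_intr(&h) == IRQ_HANDLED ||
	       model_hcd_irq(&h) == IRQ_HANDLED;
}
```

In this model, model_irq_claimed(true) is false, i.e. with a hub connected
neither handler claims the interrupt, which is the "nobody cared" situation.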

Is the reason for this behavior the lack of clock gating support on
BCM283x, or is it a driver bug?

How can I figure out whether clock gating is supported?

Regards

>
> Regards



