[BUG] bcm2711: bad_chained_irq in brcmstb_l2_intc_irq_handle
Maxime Ripard
maxime at cerno.tech
Thu Jan 20 07:39:28 PST 2022
Hi Stefan,
On Sun, Jan 16, 2022 at 06:26:58PM +0100, Stefan Wahren wrote:
> Recently I saw a report [1] about a bad chained IRQ with Linux 5.15.13
> on AArch64 with Arch Linux. I'm able to reproduce this issue on my
> Raspberry Pi 4 B (8 GB RAM, Firmware: 2022-01-06T15:39:30) by turning
> the connected HDMI monitor off and on again.
By turning the monitor on and off, you mean that you used the power
button on it? Not something like disabling the output in sysfs, right?
> Kernel output is the following:
>
> [15053.285438] irq 10, desc: 00000000acc41fca, depth: 0, count: 0, unhandled: 0
> [15053.295440] ->handle_irq(): 00000000b28cf1d1, brcmstb_l2_intc_irq_handle+0x0/0x1e0
> [15053.306049] ->irq_data.chip(): 000000005f172760, gic_data+0x0/0x768
> [15053.315233] ->action(): 00000000236e815e
> [15053.322022] ->action->handler(): 0000000013023289, bad_chained_irq+0x0/0x50
> [15053.331909] IRQ_LEVEL set
> [15053.337822] IRQ_NOPROBE set
> [15053.343715] IRQ_NOREQUEST set
> [15053.349585] IRQ_NOTHREAD set
IRQ10 is the interrupt signalling that a monitor has been connected on
HDMI1, which makes sense if you were using HDMI1. Usually, when a display
is turned on, it will issue a pulse on the HPD line, so we would get a
disconnection interrupt followed by a connection interrupt.
This is weird though, since we have an interrupt handler on that
interrupt (hpd-connected in the DT binding):
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/vc4/vc4_hdmi.c#L1578
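
To make the expected disconnect-then-connect sequence concrete, here is a minimal sketch that models the HPD line as a list of sampled levels and reports the edges. It is only an illustration: `hpd_events` and the sample values are hypothetical and are not taken from the vc4 driver.

```python
def hpd_events(samples):
    """Return the hotplug events implied by consecutive HPD line samples.

    samples: sequence of 0/1 levels, where 1 means HPD is asserted.
    This is a toy model, not how the hardware or the driver works.
    """
    events = []
    for prev, cur in zip(samples, samples[1:]):
        if prev and not cur:
            events.append("disconnected")   # falling edge on HPD
        elif cur and not prev:
            events.append("connected")      # rising edge on HPD
    return events


# HPD is high, drops briefly when the display is powered on, then rises again.
pulse = [1, 1, 0, 0, 1, 1]
print(hpd_events(pulse))
```

With a single low pulse, the model yields one "disconnected" event followed by one "connected" event, which is the pair of interrupts described above.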
> Content of /proc/interrupts after the issue occurred:
>
>           CPU0     CPU1     CPU2     CPU3
>    9:        0        0        0        0  GICv2 25 Level vgic
>   10:        1        0        0        0  GICv2 128 Level (null)
>   12:   130322    26028    27670   135225  GICv2 30 Level arch_timer
>   13:        0        0        0        0  GICv2 27 Level kvm guest vtimer
>   19:        0        0        0        0  GICv2 107 Level fe004000.txp
>   20:     7450        0        0        0  GICv2 65 Level fe00b880.mailbox
>   25:     6525        0        0        0  GICv2 153 Level uart-pl011
>   26:        0        0        0        0  GICv2 149 Level fe205000.i2c, fe804000.i2c
>   27:        9        0        0        0  GICv2 125 Level ttyS1
>   28:    36999        0        0        0  GICv2 158 Level mmc0, mmc1
>   29:        1        0        0        0  GICv2 129 Level vc4 hvs
>   30:        0        0        0        0  GICv2 105 Level fe980000.usb, fe980000.usb
>   31:        0        0        0        0  GICv2 112 Level DMA IRQ
>   33:        0        0        0        0  GICv2 114 Level DMA IRQ
>   40:        0        0        0        0  GICv2 141 Level vc4 crtc
>   41:        0        0        0        0  GICv2 142 Level vc4 crtc, vc4 crtc
>   42:       10        0        0        0  GICv2 133 Level vc4 crtc
>   43:        1        0        0        0  interrupt-controller@7ef00100 0 Edge vc4 hdmi cec tx
>   44:        0        0        0        0  interrupt-controller@7ef00100 1 Edge vc4 hdmi cec rx
>   47:        0        0        0        0  interrupt-controller@7ef00100 4 Edge vc4 hdmi hpd connected
>   48:        1        0        0        0  interrupt-controller@7ef00100 5 Edge vc4 hdmi hpd disconnected
>   49:        0        0        0        0  interrupt-controller@7ef00100 8 Edge vc4 hdmi cec tx
>   50:        0        0        0        0  interrupt-controller@7ef00100 7 Edge vc4 hdmi cec rx
>   53:        0        0        0        0  interrupt-controller@7ef00100 10 Edge vc4 hdmi hpd connected
And it's there as well.
>   54:        0        0        0        0  interrupt-controller@7ef00100 11 Edge vc4 hdmi hpd disconnected
>   55:        7        0        0        0  GICv2 66 Level VCHIQ doorbell
>   56:        0        0        0        0  GICv2 48 Level arm-pmu
>   57:        0        0        0        0  GICv2 49 Level arm-pmu
>   58:        0        0        0        0  GICv2 50 Level arm-pmu
>   59:        0        0        0        0  GICv2 51 Level arm-pmu
>   62:    47599        0        0        0  GICv2 189 Level eth0
>   63:     4681        0        0        0  GICv2 190 Level eth0
>   64:        0        0        0        0  GICv2 175 Level PCIe PME, aerdrv
>   65:      326        0        0        0  BRCM STB PCIe MSI 524288 Edge xhci_hcd
> IPI0:     2442     5185     7195    18290  Rescheduling interrupts
> IPI1:      481      383      518      533  Function call interrupts
> IPI2:        0        0        0        0  CPU stop interrupts
> IPI3:        0        0        0        0  CPU stop (for crash dump) interrupts
> IPI4:        0        0        0        0  Timer broadcast interrupts
> IPI5:        1        0        0        0  IRQ work interrupts
> IPI6:        0        0        0        0  CPU wake-up interrupts
>  Err:        1
>
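
For anyone who wants to check their own system for the same symptom, below is a minimal sketch that scans /proc/interrupts-style text for lines that have fired but carry no action name, like IRQ 10 above. The helper names `parse_interrupts` and `suspicious` are made up for illustration, and the sample lines are taken from the quoted output, re-joined onto single lines as /proc/interrupts prints them. Treating "(null)" as the marker is an assumption based solely on this report.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into a list of dicts.

    Lines that do not have one counter per CPU (e.g. "Err: 1") are skipped.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())              # header row: CPU0 CPU1 ...
    entries = []
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = fields[:ncpu]
        if len(counts) < ncpu or not all(c.isdigit() for c in counts):
            continue
        entries.append({
            "irq": label.strip(),
            "total": sum(int(c) for c in counts),
            "desc": " ".join(fields[ncpu:]),
        })
    return entries


def suspicious(entries):
    """Entries that fired at least once but show "(null)" as the action name."""
    return [e for e in entries if e["total"] > 0 and e["desc"].endswith("(null)")]


SAMPLE = """\
          CPU0 CPU1 CPU2 CPU3
  9: 0 0 0 0 GICv2 25 Level vgic
 10: 1 0 0 0 GICv2 128 Level (null)
 53: 0 0 0 0 interrupt-controller@7ef00100 10 Edge vc4 hdmi hpd connected
Err: 1
"""

for entry in suspicious(parse_interrupts(SAMPLE)):
    print(entry["irq"], entry["desc"])
```

On a live system the same functions could be fed the contents of /proc/interrupts read from disk instead of `SAMPLE`.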
> Comparing the vendor & mainline DTS, I noticed differences at hdmi0/1.
> The vendor DTS has an additional register to access the same space as
> aon_intr (interrupt parent), which looks ugly [2].
This is an artifact from the past. We used to use that register directly
in our driver before we upstreamed the CEC support, but we don't
anymore. The DT patch must have been carried around since then, but
nothing should be using it.
> Additionally I noted that bcm2711.dtsi uses the compatible
> "brcm,bcm2711-l2-intc" with a level high interrupt, but according to
> irq-brcmstb-l2.c [3] the compatible is not defined and would fall back to
> "brcm,l2-intc" with brcmstb_l2_edge_intc_of_init. This looks fishy.
>
> I didn't try to reproduce this with Raspberry Pi OS & mainline kernel,
> but I hope this is enough information so far.
I don't remember anyone reporting this before, and I have tested the
disconnection / connection interrupts myself a number of times without
ever seeing this. The level vs edge stuff might be a good explanation.
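
The fallback Stefan describes can be sketched as a simplified model of compatible matching: the node lists its compatibles from most to least specific, and the first one the driver knows about decides which init function runs. This is a toy model, not the kernel's matching code. The node's compatible list is assumed from the description above, and `hypothetical_level_init` is a made-up name used only to show what a dedicated entry would change.

```python
# Driver side, as described in the quoted mail: only the generic
# compatible is known, and it maps to the edge init function.
IRQCHIP_TABLE = {
    "brcm,l2-intc": "brcmstb_l2_edge_intc_of_init",
}

# Node side (assumed): most specific compatible first, generic one last.
NODE_COMPATIBLES = ["brcm,bcm2711-l2-intc", "brcm,l2-intc"]


def pick_init(node_compatibles, table):
    """Return (compatible, init function) for the first known compatible."""
    for compat in node_compatibles:
        if compat in table:
            return compat, table[compat]
    return None


# With only the generic entry, the node ends up on the edge init path.
print(pick_init(NODE_COMPATIBLES, IRQCHIP_TABLE))

# Hypothetical: a dedicated entry for the specific compatible would win.
extended = dict(IRQCHIP_TABLE)
extended["brcm,bcm2711-l2-intc"] = "hypothetical_level_init"
print(pick_init(NODE_COMPATIBLES, extended))
```

In this model the DT can declare a level interrupt while the controller is still set up through the edge init path, which is the mismatch in question.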
Maxime