[BUG] bcm2711: bad_chained_irq in brcmstb_l2_intc_irq_handle
Florian Fainelli
f.fainelli at gmail.com
Sun Jan 16 12:15:53 PST 2022
Hi Stefan,
On 1/16/2022 9:26 AM, Stefan Wahren wrote:
> Hi,
>
> Recently I saw a report [1] about a bad chained IRQ with Linux 5.15.13
> AArch64 on Arch Linux. I'm able to reproduce this issue on my
> Raspberry Pi 4 B (8 GB RAM, Firmware: 2022-01-06T15:39:30) by turning
> the connected HDMI monitor off and on again.
>
> Kernel output is the following:
>
> [15053.285438] irq 10, desc: 00000000acc41fca, depth: 0, count: 0, unhandled: 0
> [15053.295440] ->handle_irq(): 00000000b28cf1d1, brcmstb_l2_intc_irq_handle+0x0/0x1e0
> [15053.306049] ->irq_data.chip(): 000000005f172760, gic_data+0x0/0x768
> [15053.315233] ->action(): 00000000236e815e
> [15053.322022] ->action->handler(): 0000000013023289, bad_chained_irq+0x0/0x50
> [15053.331909] IRQ_LEVEL set
> [15053.337822] IRQ_NOPROBE set
> [15053.343715] IRQ_NOREQUEST set
> [15053.349585] IRQ_NOTHREAD set
OK, so those should have been cleared during the call to
irq_alloc_domain_generic_chips(). Any clue why they are set here?
>
> Content of /proc/interrupts after the issue occurred:
>
>              CPU0       CPU1       CPU2       CPU3
>    9:          0          0          0          0  GICv2  25 Level     vgic
>   10:          1          0          0          0  GICv2 128 Level     (null)
This is suspicious: we should not have an interrupt registered here at
all, since we call irq_set_chained_handler_and_data(). This is an L2
interrupt controller, not a leaf interrupt with a descriptor. On all
Broadcom STB systems where this interrupt controller is used, I
definitely don't see any entries for the L2 output to the GIC in
/proc/interrupts because no interrupt descriptor is allocated.
This makes me wonder where this is coming from, but the message about
irq 10 being bad would make sense then.
>   12:     130322      26028      27670     135225  GICv2  30 Level     arch_timer
>   13:          0          0          0          0  GICv2  27 Level     kvm guest vtimer
>   19:          0          0          0          0  GICv2 107 Level     fe004000.txp
>   20:       7450          0          0          0  GICv2  65 Level     fe00b880.mailbox
>   25:       6525          0          0          0  GICv2 153 Level     uart-pl011
>   26:          0          0          0          0  GICv2 149 Level     fe205000.i2c, fe804000.i2c
>   27:          9          0          0          0  GICv2 125 Level     ttyS1
>   28:      36999          0          0          0  GICv2 158 Level     mmc0, mmc1
>   29:          1          0          0          0  GICv2 129 Level     vc4 hvs
>   30:          0          0          0          0  GICv2 105 Level     fe980000.usb, fe980000.usb
>   31:          0          0          0          0  GICv2 112 Level     DMA IRQ
>   33:          0          0          0          0  GICv2 114 Level     DMA IRQ
>   40:          0          0          0          0  GICv2 141 Level     vc4 crtc
>   41:          0          0          0          0  GICv2 142 Level     vc4 crtc, vc4 crtc
>   42:         10          0          0          0  GICv2 133 Level     vc4 crtc
>   43:          1          0          0          0  interrupt-controller@7ef00100   0 Edge      vc4 hdmi cec tx
>   44:          0          0          0          0  interrupt-controller@7ef00100   1 Edge      vc4 hdmi cec rx
>   47:          0          0          0          0  interrupt-controller@7ef00100   4 Edge      vc4 hdmi hpd connected
>   48:          1          0          0          0  interrupt-controller@7ef00100   5 Edge      vc4 hdmi hpd disconnected
>   49:          0          0          0          0  interrupt-controller@7ef00100   8 Edge      vc4 hdmi cec tx
>   50:          0          0          0          0  interrupt-controller@7ef00100   7 Edge      vc4 hdmi cec rx
>   53:          0          0          0          0  interrupt-controller@7ef00100  10 Edge      vc4 hdmi hpd connected
>   54:          0          0          0          0  interrupt-controller@7ef00100  11 Edge      vc4 hdmi hpd disconnected
>   55:          7          0          0          0  GICv2  66 Level     VCHIQ doorbell
>   56:          0          0          0          0  GICv2  48 Level     arm-pmu
>   57:          0          0          0          0  GICv2  49 Level     arm-pmu
>   58:          0          0          0          0  GICv2  50 Level     arm-pmu
>   59:          0          0          0          0  GICv2  51 Level     arm-pmu
>   62:      47599          0          0          0  GICv2 189 Level     eth0
>   63:       4681          0          0          0  GICv2 190 Level     eth0
>   64:          0          0          0          0  GICv2 175 Level     PCIe PME, aerdrv
>   65:        326          0          0          0  BRCM STB PCIe MSI 524288 Edge      xhci_hcd
> IPI0:       2442       5185       7195      18290  Rescheduling interrupts
> IPI1:        481        383        518        533  Function call interrupts
> IPI2:          0          0          0          0  CPU stop interrupts
> IPI3:          0          0          0          0  CPU stop (for crash dump) interrupts
> IPI4:          0          0          0          0  Timer broadcast interrupts
> IPI5:          1          0          0          0  IRQ work interrupts
> IPI6:          0          0          0          0  CPU wake-up interrupts
>  Err:          1
>
> Comparing the vendor & mainline DTS, I noticed differences at hdmi0/1.
> The vendor DTS has an additional register to access the same space as
> aon_intr (interrupt parent), which looks ugly [2].
OK, so while I do see how we set external_irq_controller to let the
interrupt be managed externally from vc4_hdmi.c, else do it internally,
I don't see the code that tries to map the "intr2" register space, let
alone fetch its resource. Do you know where that is?
>
> Additionally, I noted that bcm2711.dtsi uses the compatible
> "brcm,bcm2711-l2-intc" with a level high interrupt, but according to
> irq-brcmstb-l2.c [3] the compatible is not defined and would fall back
> to "brcm,l2-intc" with brcmstb_l2_edge_intc_of_init. This looks fishy.
The hardware level 2 controller at 0x7ef00100 is definitely the same
kind that supports edge interrupt flows, since it has the CPU_STATUS,
CPU_CLEAR, CPU_MASK_STATUS, CPU_MASK_SET and CPU_MASK_CLEAR registers,
so it seems to me that declaring the interrupt controller as the kind
"brcm,l2-intc" and using the edge interrupt handling is not necessarily
a problem.
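To make the edge flow concrete, here is a minimal user-space sketch of how
those registers interact. It is a software model, not the driver code: the
struct and the l2_* helper names are hypothetical, and it assumes that a set
bit in CPU_MASK_STATUS means "masked" and that everything starts out masked.
An event is latched in CPU_STATUS, stays there until it is acknowledged
through CPU_CLEAR, and only reaches the parent while it is unmasked.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of an edge-type L2 controller. */
struct l2_model {
    uint32_t cpu_status;      /* CPU_STATUS: latched events */
    uint32_t cpu_mask_status; /* CPU_MASK_STATUS: 1 = masked (assumption) */
};

/* Assumed reset state: nothing latched, every source masked. */
static void l2_init(struct l2_model *l2)
{
    l2->cpu_status = 0;
    l2->cpu_mask_status = UINT32_MAX;
}

/* Hardware side: an edge on source hwirq is latched into CPU_STATUS. */
static void l2_latch_event(struct l2_model *l2, unsigned int hwirq)
{
    l2->cpu_status |= UINT32_C(1) << hwirq;
}

/* Writing 1s to CPU_CLEAR acknowledges (clears) latched events. */
static void l2_write_cpu_clear(struct l2_model *l2, uint32_t bits)
{
    l2->cpu_status &= ~bits;
}

/* Writing 1s to CPU_MASK_SET masks sources. */
static void l2_write_cpu_mask_set(struct l2_model *l2, uint32_t bits)
{
    l2->cpu_mask_status |= bits;
}

/* Writing 1s to CPU_MASK_CLEAR unmasks sources. */
static void l2_write_cpu_mask_clear(struct l2_model *l2, uint32_t bits)
{
    l2->cpu_mask_status &= ~bits;
}

/* What a chained handler would look at: latched and not masked. */
static uint32_t l2_pending(const struct l2_model *l2)
{
    return l2->cpu_status & ~l2->cpu_mask_status;
}

/*
 * Acknowledge every pending source, lowest bit first, and return how
 * many were handled. A return value of 0 models a parent interrupt
 * that fired with nothing pending, i.e. a spurious one.
 */
static unsigned int l2_dispatch(struct l2_model *l2)
{
    uint32_t pending = l2_pending(l2);
    unsigned int handled = 0;

    while (pending) {
        uint32_t bit = pending & (~pending + 1); /* lowest set bit */

        l2_write_cpu_clear(l2, bit);
        pending &= ~bit;
        handled++;
    }
    return handled;
}

/* Walk through one event on source 5; returns 0 on success. */
int l2_demo(void)
{
    struct l2_model l2;

    l2_init(&l2);
    l2_latch_event(&l2, 5);
    assert(l2_pending(&l2) == 0);           /* latched but still masked */

    l2_write_cpu_mask_clear(&l2, UINT32_C(1) << 5);
    assert(l2_pending(&l2) == (UINT32_C(1) << 5));

    assert(l2_dispatch(&l2) == 1);          /* acked via CPU_CLEAR */
    assert(l2_dispatch(&l2) == 0);          /* nothing left: spurious */

    l2_write_cpu_mask_set(&l2, UINT32_C(1) << 5);
    l2_latch_event(&l2, 5);
    assert(l2_pending(&l2) == 0);           /* masked again */
    return 0;
}
```

Source 5 is used only because it matches the "hpd disconnected" line in the
/proc/interrupts output above. The second l2_dispatch() call returning 0 is
the case of interest in this model: the parent fires but nothing is both
latched and unmasked.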
>
> I didn't try to reproduce this with Raspberry Pi OS & mainline kernel,
> but I hope this is enough information so far.
>
> [1] - https://archlinuxarm.org/forum/viewtopic.php?f=65&t=15791
>
> [2] -
> https://github.com/raspberrypi/linux/blob/rpi-5.15.y/arch/arm/boot/dts/bcm2711.dtsi#L339
>
> [3] -
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/irqchip/irq-brcmstb-l2.c?h=v5.15.15#n278
>
--
Florian