[BUG] bcm2711: bad_chained_irq in brcmstb_l2_intc_irq_handle

Florian Fainelli f.fainelli at gmail.com
Sun Jan 16 20:59:15 PST 2022



On 1/16/2022 12:15 PM, Florian Fainelli wrote:
> Hi Stefan,
> 
> On 1/16/2022 9:26 AM, Stefan Wahren wrote:
>> Hi,
>>
>> recently I saw a report [1] about a bad chained IRQ with Linux 5.15.13
>> Aarch64 with Arch Linux. I'm able to reproduce this issue on my
>> Raspberry Pi 4 B (8 GB RAM, Firmware: 2022-01-06T15:39:30) by turning
>> the connected HDMI monitor off and on again.
>>
>> Kernel output is the following:
>>
>> [15053.285438] irq 10, desc: 00000000acc41fca, depth: 0, count: 0,
>> unhandled: 0
>> [15053.295440] ->handle_irq():  00000000b28cf1d1,
>> brcmstb_l2_intc_irq_handle+0x0/0x1e0
>> [15053.306049] ->irq_data.chip(): 000000005f172760, gic_data+0x0/0x768
>> [15053.315233] ->action(): 00000000236e815e
>> [15053.322022] ->action->handler(): 0000000013023289,
>> bad_chained_irq+0x0/0x50
>> [15053.331909]      IRQ_LEVEL set
>> [15053.337822]    IRQ_NOPROBE set
>> [15053.343715]  IRQ_NOREQUEST set
>> [15053.349585]   IRQ_NOTHREAD set
> 
> OK, so those should have been cleared during the call to 
> irq_alloc_domain_generic_chips(), any clues why they are set here?
> 
>>
>> Content of /proc/interrupts after the issue occurred:
>>
>>             CPU0       CPU1       CPU2       CPU3
>>    9:          0          0          0          0     GICv2  25 Level
>> vgic
>>   10:          1          0          0          0     GICv2 128 Level
>> (null)
> 
> This is suspicious, we should not have an interrupt registered here at 
> all since we call irq_set_chained_handler_and_data(); this is an L2
> interrupt controller, not a leaf interrupt with a descriptor. On all
> Broadcom STB systems where this interrupt controller is used, I
> definitely don't see any entries for the L2 output to the GIC in
> /proc/interrupts because no interrupt descriptor is allocated.
> 
> This makes me wonder where this is coming from but the message about irq 
> 10 being bad would make sense then.
> 
>>   12:     130322      26028      27670     135225     GICv2  30 Level
>> arch_timer
>>   13:          0          0          0          0     GICv2  27 Level
>> kvm guest vtimer
>>   19:          0          0          0          0     GICv2 107 Level
>> fe004000.txp
>>   20:       7450          0          0          0     GICv2  65 Level
>> fe00b880.mailbox
>>   25:       6525          0          0          0     GICv2 153 Level
>> uart-pl011
>>   26:          0          0          0          0     GICv2 149 Level
>> fe205000.i2c, fe804000.i2c
>>   27:          9          0          0          0     GICv2 125 Level
>> ttyS1
>>   28:      36999          0          0          0     GICv2 158 Level
>> mmc0, mmc1
>>   29:          1          0          0          0     GICv2 129 Level
>> vc4 hvs
>>   30:          0          0          0          0     GICv2 105 Level
>> fe980000.usb, fe980000.usb
>>   31:          0          0          0          0     GICv2 112 Level
>> DMA IRQ
>>   33:          0          0          0          0     GICv2 114 Level
>> DMA IRQ
>>   40:          0          0          0          0     GICv2 141 Level
>> vc4 crtc
>>   41:          0          0          0          0     GICv2 142 Level
>> vc4 crtc, vc4 crtc
>>   42:         10          0          0          0     GICv2 133 Level
>> vc4 crtc
>>   43:          1          0          0          0
>> interrupt-controller@7ef00100   0 Edge      vc4 hdmi cec tx
>>   44:          0          0          0          0
>> interrupt-controller@7ef00100   1 Edge      vc4 hdmi cec rx
>>   47:          0          0          0          0
>> interrupt-controller@7ef00100   4 Edge      vc4 hdmi hpd connected
>>   48:          1          0          0          0
>> interrupt-controller@7ef00100   5 Edge      vc4 hdmi hpd disconnected
>>   49:          0          0          0          0
>> interrupt-controller@7ef00100   8 Edge      vc4 hdmi cec tx
>>   50:          0          0          0          0
>> interrupt-controller@7ef00100   7 Edge      vc4 hdmi cec rx
>>   53:          0          0          0          0
>> interrupt-controller@7ef00100  10 Edge      vc4 hdmi hpd connected
>>   54:          0          0          0          0
>> interrupt-controller@7ef00100  11 Edge      vc4 hdmi hpd disconnected
>>   55:          7          0          0          0     GICv2  66 Level
>> VCHIQ doorbell
>>   56:          0          0          0          0     GICv2  48 Level
>> arm-pmu
>>   57:          0          0          0          0     GICv2  49 Level
>> arm-pmu
>>   58:          0          0          0          0     GICv2  50 Level
>> arm-pmu
>>   59:          0          0          0          0     GICv2  51 Level
>> arm-pmu
>>   62:      47599          0          0          0     GICv2 189 Level
>> eth0
>>   63:       4681          0          0          0     GICv2 190 Level
>> eth0
>>   64:          0          0          0          0     GICv2 175 Level
>> PCIe PME, aerdrv
>>   65:        326          0          0          0  BRCM STB PCIe MSI
>> 524288 Edge      xhci_hcd
>> IPI0:      2442       5185       7195      18290       Rescheduling
>> interrupts
>> IPI1:       481        383        518        533       Function call
>> interrupts
>> IPI2:         0          0          0          0       CPU stop 
>> interrupts
>> IPI3:         0          0          0          0       CPU stop (for
>> crash dump) interrupts
>> IPI4:         0          0          0          0       Timer broadcast
>> interrupts
>> IPI5:         1          0          0          0       IRQ work 
>> interrupts
>> IPI6:         0          0          0          0       CPU wake-up
>> interrupts
>> Err:          1
>>
>> Comparing the vendor & mainline DTS, I noticed differences at hdmi0/1.
>> The vendor DTS has an additional register to access the same space as
>> aon_intr (interrupt parent), which looks ugly [2].
> 
> OK, so while I do see how we set external_irq_controller to let the 
> interrupt be managed externally from vc4_hdmi.c else do it internally, I 
> don't see the code that tries to map the "intr2" register space, let 
> alone fetch its resource, do you know where that is?

Downstream appears to set status = "disabled" on the 'aon_intr' Device Tree
node, which still leaves me wondering whether the "intr2" register space
is used.

I think it would be prudent to change the hpd_con and hpd_rm variables in
vc4_hdmi.c from unsigned to signed integers, if nothing else so that a
negative error code such as -EPROBE_DEFER can be detected and returned.
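
To illustrate the point about signedness, here is a minimal userspace sketch,
not the actual vc4_hdmi.c code. The helpers model_get_irq() and
model_request_irq() and the MODEL_* constants are hypothetical stand-ins
that only assume an IRQ lookup returning a negative errno and an IRQ
request that rejects out-of-range numbers; EPROBE_DEFER is redefined
locally with its kernel value because userspace errno.h does not provide
it.

```c
#include <assert.h>

#define EPROBE_DEFER	517	/* kernel-internal errno value */
#define MODEL_EINVAL	22	/* stand-in for EINVAL */
#define MODEL_NR_IRQS	1024u	/* hypothetical size of the IRQ space */

/* Hypothetical IRQ lookup: fails because the parent is not ready yet. */
static int model_get_irq(void)
{
	return -EPROBE_DEFER;
}

/* Hypothetical IRQ request: rejects numbers outside the IRQ space. */
static int model_request_irq(unsigned int irq)
{
	return irq < MODEL_NR_IRQS ? 0 : -MODEL_EINVAL;
}

/* Unsigned storage: the errno becomes a huge IRQ number and is lost. */
static int probe_unsigned(void)
{
	unsigned int hpd_con = (unsigned int)model_get_irq();

	return model_request_irq(hpd_con);
}

/* Signed storage: the errno can be checked and propagated as-is. */
static int probe_signed(void)
{
	int hpd_con = model_get_irq();

	if (hpd_con < 0)
		return hpd_con;

	return model_request_irq((unsigned int)hpd_con);
}
```

In this model probe_unsigned() ends up reporting the generic invalid-argument
error, while probe_signed() hands -EPROBE_DEFER back to its caller so the
probe could be retried later.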

Also, in 5.16, we can finally build irq-brcmstb-l2 as a module, so 
dealing with -EPROBE_DEFER might be necessary.

I am still lost as to why GIC_SPI 96 was allowed to show up as a leaf
interrupt, unless the VPU firmware is making some modification to the
Device Tree being used?
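
For reference, GIC_SPI 96 is the "GICv2 128" entry (irq 10) in the listing
above: /proc/interrupts prints the GIC hardware IRQ number, and shared
peripheral interrupts start after the 16 SGIs and 16 PPIs. A small sketch
of that arithmetic, where the helper name gic_hwirq_to_spi() is made up
for illustration:

```c
#include <assert.h>

/* GIC hwirqs 0-15 are SGIs and 16-31 are PPIs, so SPI n is hwirq n + 32. */
#define GIC_SPI_BASE	32u

/* Convert a GIC hwirq, as shown in /proc/interrupts, to its SPI number. */
static unsigned int gic_hwirq_to_spi(unsigned int hwirq)
{
	assert(hwirq >= GIC_SPI_BASE);	/* SGIs and PPIs have no SPI number */

	return hwirq - GIC_SPI_BASE;
}
```

With that, gic_hwirq_to_spi(128) gives 96, matching the interrupt specifier
used for the L2 controller output.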

> 
>>
>> Additionally I noted that bcm2711.dtsi uses the compatible
>> "brcm,bcm2711-l2-intc" with a level high interrupt, but according to
>> irq-brcmstb-l2.c [3] the compatible is not defined and would fall back to
>> "brcm,l2-intc" with brcmstb_l2_edge_intc_of_init. This looks fishy.
> 
> The hardware level 2 controller at 0x7ef00100 is definitely the same
> kind that supports edge interrupt flows since it has the CPU_STATUS, 
> CPU_CLEAR, CPU_MASK_STATUS, CPU_MASK_SET and CPU_MASK_CLEAR registers, 
> so it seems to me like declaring the interrupt controller of the kind 
> "brcm,l2-intc" and using the edge interrupt handling is not necessarily 
> a problem.
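
As a rough illustration of how that register set supports an edge flow,
here is a simplified userspace model, not the irq-brcmstb-l2.c code. The
struct and function names are invented for the sketch, and it only assumes
that pending interrupts are the status bits that are not masked and that
writing a bit to the clear register acknowledges it.

```c
#include <stdint.h>

/* Model of the latched state behind CPU_STATUS and CPU_MASK_STATUS. */
struct l2_model {
	uint32_t status;	/* CPU_STATUS: latched interrupt bits */
	uint32_t mask;		/* CPU_MASK_STATUS: 1 means masked */
};

/* CPU_MASK_SET: written bits become masked. */
static void l2_mask_set(struct l2_model *l2, uint32_t bits)
{
	l2->mask |= bits;
}

/* CPU_MASK_CLEAR: written bits become unmasked. */
static void l2_mask_clear(struct l2_model *l2, uint32_t bits)
{
	l2->mask &= ~bits;
}

/* CPU_CLEAR: written bits are acknowledged and drop out of the status. */
static void l2_clear(struct l2_model *l2, uint32_t bits)
{
	l2->status &= ~bits;
}

/* Interrupts to service: latched and not masked. */
static uint32_t l2_pending(const struct l2_model *l2)
{
	return l2->status & ~l2->mask;
}

/* Acknowledge every pending bit; return how many were serviced. */
static unsigned int l2_dispatch(struct l2_model *l2)
{
	uint32_t pending = l2_pending(l2);
	unsigned int handled = 0;

	for (unsigned int bit = 0; bit < 32; bit++) {
		uint32_t m = UINT32_C(1) << bit;

		if (!(pending & m))
			continue;
		l2_clear(l2, m);	/* edge flow: ack, then handle */
		handled++;
	}

	return handled;
}
```

In this model a dispatch that finds nothing pending returns 0, which
corresponds to the case the chained handler treats as a bad interrupt.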



> 
>>
>> I didn't try to reproduce this with Raspberry Pi OS & mainline kernel,
>> but I hope this is enough information so far.
>>
>> [1] - https://archlinuxarm.org/forum/viewtopic.php?f=65&t=15791
>>
>> [2] -
>> https://github.com/raspberrypi/linux/blob/rpi-5.15.y/arch/arm/boot/dts/bcm2711.dtsi#L339 
>>
>>
>> [3] -
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/irqchip/irq-brcmstb-l2.c?h=v5.15.15#n278 
>>
>>
> 

-- 
Florian


