[PATCH 2/2] PCI: brcmstb: Add panic/die handler to driver
Bjorn Helgaas
helgaas at kernel.org
Wed Aug 6 11:15:40 PDT 2025
On Fri, Jun 13, 2025 at 06:08:43PM -0400, Jim Quinlan wrote:
> Whereas most PCIe HW returns 0xffffffff on illegal accesses and the like,
> by default Broadcom's STB PCIe controller effects an abort. Some SoCs --
> 7216 and its descendants -- have new HW that identifies error details.

What's the long term plan for this? This abort is a huge problem that
we're seeing across arm64 platforms. Forcing a panic and reboot for
every uncorrectable error is pretty hard to deal with.

Is there a plan to someday recover from these aborts? Or change the
hardware so it can at least be configured to return ~0 data after
logging the error in the hardware registers?
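
To illustrate why ~0 data is easier to live with than an abort, here is
a minimal user-space sketch of a caller checking for an all-ones read.
It is not driver code: sim_readl(), sim_read_checked() and
SIM_ERROR_RESPONSE are hypothetical names, and the "hardware" is just a
flag passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: value a failed 32-bit read is assumed to return. */
#define SIM_ERROR_RESPONSE 0xffffffffu

/* Hypothetical simulated MMIO read: all ones when the link is down. */
static uint32_t sim_readl(int link_ok, uint32_t reg_value)
{
	return link_ok ? reg_value : SIM_ERROR_RESPONSE;
}

/*
 * Return 0 and store the value on success, or -1 if the read looks
 * like an error response. *out is left untouched on failure.
 */
static int sim_read_checked(int link_ok, uint32_t reg_value, uint32_t *out)
{
	uint32_t val;

	assert(out != NULL);

	val = sim_readl(link_ok, reg_value);
	if (val == SIM_ERROR_RESPONSE)
		return -1;	/* caller can recover instead of panicking */

	*out = val;
	return 0;
}
```

A register that legitimately reads as all ones is indistinguishable
from a failure here, so the check only flags a possible error, but the
caller at least gets to decide what to do next.
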
> This simple handler determines if the PCIe controller was the cause of the
> abort and if so, prints out diagnostic info. Unfortunately, an abort still
> occurs.
>
> Care is taken to read the error registers only when the PCIe bridge is
> active and the PCIe registers are accessible. Otherwise, a "die" event
> caused by something other than the PCIe could cause an abort if the PCIe
> "die" handler tried to access registers when the bridge is off.

Checking whether the bridge is active is a "mostly-works" situation
since it's always racy.
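
A rough user-space sketch of the window I mean, assuming a simulated
bridge rather than real registers. struct sim_bridge, guarded_read()
and the race_hook callback are all hypothetical; the hook only models
something else changing the bridge state between the check and the
access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulated bridge state, not the driver's structure. */
struct sim_bridge {
	bool active;
	unsigned int err_status;
};

enum read_result { READ_SKIPPED, READ_OK, READ_WOULD_ABORT };

/* Models another context running between the check and the read. */
typedef void (*race_hook)(struct sim_bridge *);

static enum read_result guarded_read(struct sim_bridge *b, race_hook between,
				     unsigned int *out)
{
	assert(b != NULL && out != NULL);

	if (!b->active)			/* the check */
		return READ_SKIPPED;

	if (between)			/* the race window */
		between(b);

	if (!b->active)			/* real hardware would abort here */
		return READ_WOULD_ABORT;

	*out = b->err_status;		/* the access */
	return READ_OK;
}

/* Example hook: the bridge is turned off inside the window. */
static void power_down(struct sim_bridge *b)
{
	b->active = false;
}
```

With no hook the read succeeds; with power_down() as the hook the
check passes and the access still lands on an inactive bridge, which
is the case the check cannot close.
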
> Example error output:
> brcm-pcie 8b20000.pcie: Error: Mem Acc: 32bit, Read, @0x38000000
> brcm-pcie 8b20000.pcie: Type: TO=0 Abt=0 UnspReq=1 AccDsble=0 BadAddr=0