Enumerating an empty bus hangs the entire system
Liviu Dudau
liviu.dudau at arm.com
Wed Mar 15 08:55:11 PDT 2017
On Wed, Mar 15, 2017 at 04:25:44PM +0100, Mason wrote:
> Hello,
Hi Mason,
>
> My driver works reasonably well on revision 1 of the PCIe controller.
> (For lax enough values of "reasonably well"...)
>
> So I wanted to try it out on revision 2 of the controller.
>
> Turns out the system hangs if I boot with no card inserted in the PCIe
> slot. (This does not happen on revision 1.) If I log all config space
> accesses, this is what I see:
>
> ...
> [ 2.966402] tango_config_read: bus=0 devfn=0 where=128 size=2
> [ 2.972284] tango_config_read: bus=0 devfn=0 where=140 size=4
> [ 2.978167] tango_config_read: bus=0 devfn=0 where=146 size=2
> [ 2.984144] pci_bus 0000:01: busn_res: can not insert [bus 01-ff] under [bus 00-3f] (conflicts with (null) [bus 00-3f])
> [ 2.995105] tango_config_write: bus=0 devfn=0 where=24 size=4 val=0xff0100
> [ 3.002134] pci_bus 0000:01: scanning bus
> [ 3.006274] tango_config_read: bus=1 devfn=0 where=0 size=4
>
> Basically, the PCI framework tries to read vendor and device IDs
> of the non-existent device on bus 1, which hangs the system,
> because the read never completes :-(
>
> I had the same problem with the legacy driver for 3.4 but I was
> hoping I would magically avoid it in a recent kernel.
>
> The only work-around I see is: assuming the first access to a
> bus will be to register 0, check the PHY for an active link
> before sending an actual read request to register 0.
>
> Is that reasonable?
>
> Is it compliant for the PCIe controller to hang like that,
> or should it handle some kind of time out?
AFAIK it is not. The PCI (and PCIe) host controller should return 0xffffffff
for a config read to any bus/dev/fn combination where nothing exists.
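To make the expected behaviour concrete, here is a minimal user-space sketch
of a config read that answers with all ones when nothing is there, combined
with the "check the link first" idea from above. It is only a model under
stated assumptions: struct fake_pcie, downstream_config_read() and the CFG_*
codes are made-up names for illustration, not the tango driver or any kernel
API, and the link_up flag stands in for whatever PHY/link status register the
hardware really provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the controller; NOT the real tango registers. */
struct fake_pcie {
	bool link_up;         /* stand-in for a PHY/link status bit */
	uint32_t dev_cfg[64]; /* 256 bytes of config space, one device */
};

/* Made-up return codes; the kernel's equivalents are PCIBIOS_SUCCESSFUL
 * and PCIBIOS_DEVICE_NOT_FOUND. */
#define CFG_OK        0
#define CFG_NO_DEVICE 1

/*
 * Model a config read on the bus below the root port.
 * Only devfn 0 exists, and only while the link is up. In every other
 * case no request is sent and the caller gets all ones, which is what
 * the PCI core interprets as "nothing here".
 */
int downstream_config_read(const struct fake_pcie *p, unsigned int devfn,
			   int where, int size, uint32_t *val)
{
	uint32_t mask;

	/* Reject sizes and offsets this simple model cannot represent. */
	if ((size != 1 && size != 2 && size != 4) ||
	    where < 0 || where + size > 256 || (where % size) != 0) {
		*val = 0xffffffffu;
		return CFG_NO_DEVICE;
	}

	/* All ones, truncated to the access size (0xff, 0xffff, ...). */
	mask = 0xffffffffu >> (8 * (4 - size));

	/* Link down or no such device: never touch the hardware. */
	if (!p->link_up || devfn != 0) {
		*val = mask;
		return CFG_NO_DEVICE;
	}

	/* Device present: extract the requested bytes (little-endian). */
	*val = (p->dev_cfg[where / 4] >> (8 * (where % 4))) & mask;
	return CFG_OK;
}
```

With link_up false, the read of "where=0 size=4" from the log would come back
as 0xffffffff straight away instead of being issued on the bus. Note that the
check is done on every access here, not just on the first read of register 0,
so it does not depend on the order in which the PCI core probes registers.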
>
> Liviu suggested: "The PCIe controller probably generates (or propagates)
> a bus abort that it should actually trap in HW. Check if there is a SW
> configurable way to recover that."
>
> But I unmasked all system/misc errors, and I don't see any
> interrupts firing.
Have a look at drivers/pci/dwc/pci-imx6.c, where imx6_pcie_probe() hooks an
abort handler. See if doing the same thing helps you.
Best regards,
Liviu
>
> Regards.
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯