Enumerating an empty bus hangs the entire system

Fri Apr 7 16:21:02 EDT 2017

On Fri, Apr 07, 2017 at 05:15:32PM +0200, Mason wrote:
> On 15/03/2017 16:25, Mason wrote:
> 
> > My driver works reasonably well on revision 1 of the PCIe controller.
> > (For lax enough values of "reasonably well"...)
> > 
> > So I wanted to try it out on revision 2 of the controller.
> > 
> > Turns out the system hangs if I boot with no card inserted in the PCIe
> > slot. (This does not happen on revision 1.) If I log all config space
> > accesses, this is what I see:
> > 
> > ...
> > [    2.966402] tango_config_read: bus=0 devfn=0 where=128 size=2
> > [    2.972284] tango_config_read: bus=0 devfn=0 where=140 size=4
> > [    2.978167] tango_config_read: bus=0 devfn=0 where=146 size=2
> > [    2.984144] pci_bus 0000:01: busn_res: can not insert [bus 01-ff] under [bus 00-3f] (conflicts with (null) [bus 00-3f])
> > [    2.995105] tango_config_write: bus=0 devfn=0 where=24 size=4 val=0xff0100
> > [    3.002134] pci_bus 0000:01: scanning bus
> > [    3.006274] tango_config_read: bus=1 devfn=0 where=0 size=4
> > 
> > Basically, the PCI framework tries to read vendor and device IDs
> > of the non-existent device on bus 1, which hangs the system,
> > because the read never completes :-(
> > 
> > I had the same problem with the legacy driver for 3.4 but I was
> > hoping I would magically avoid it in a recent kernel.
> > 
> > The only work-around I see is: assuming the first access to a
> > bus will be to register 0, check the PHY for an active link
> > before sending an actual read request to register 0.
> > 
> > Is that reasonable?
> > 
> > Is it compliant for the PCIe controller to hang like that,
> > or should it handle some kind of time out?
> > 
> > Liviu suggested: "The PCIe controller probably generates (or propagates)
> > a bus abort that it should actually trap in HW. Check if there is a SW
> > configurable way to recover that."
> > 
> > But I unmasked all system/misc errors, and I don't see any
> > interrupts firing.
> 
> I now have a better understanding of the situation, which inevitably
> leads to more questions...
> 
> By reading a controller-specific debug register, I sampled the LTSSM
> (Link Training and Status State-Machine) value as fast as possible.
> 
> A) if there is no card inserted in the PCIe slot, the State-Machine
> oscillates between "Detect.Quiet" and "Detect.Active" SubStates of
> the "Detect" State.
> 
> B) if there is a card inserted in the PCIe slot, then after a few
> milliseconds, the State-Machine changes to "Polling.Active", then
> "Polling.Configuration", then "Configuration" (this step must be
> very short, because I don't see it consistently), then "L0".
> 
> 
> One issue I noted in a separate message is that, on rev1 of my HW,
> if the PCIe framework tries to read the card's device ID too soon,
> i.e. before link training is complete, then the read returns ~0,
> and the framework immediately gives up.
> 
> Looking at pci_bus_read_dev_vendor_id(), I see that there is a
> retry mechanism implemented, but it seems to be a quirk?

Configuration Request Retry is not a quirk; it's a standard part of
PCIe (see PCIe r3.1, sec 7.8.13), and pci_enable_crs() enables it
whenever a Root Port claims to support it.
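
To make the retry behaviour concrete, here is a minimal user-space sketch
of the idea, not the kernel's actual code. It assumes that, with CRS
Software Visibility enabled, a device that is not yet ready answers a
vendor ID read with 0x0001, and that the caller backs off with a doubling
delay until a timeout. The names read_vendor_id_with_crs(), struct
fake_dev, fake_read() and crs_no_delay() are hypothetical and exist only
for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define CRS_VENDOR_ID 0x0001u   /* vendor ID seen while the device asks for a retry */

typedef uint32_t (*cfg_read_fn)(void *ctx);
typedef void (*delay_fn)(unsigned int ms);

/* Hypothetical device model: answers with CRS a few times, then its real ID. */
struct fake_dev {
	unsigned int crs_replies_left;
	uint32_t real_id;            /* 0xffffffff models an empty slot */
};

static uint32_t fake_read(void *ctx)
{
	struct fake_dev *dev = ctx;

	if (dev->crs_replies_left > 0) {
		dev->crs_replies_left--;
		return 0xffff0001u;  /* CRS: vendor ID 0x0001, remaining bits set */
	}
	return dev->real_id;
}

/* Stand-in for a real sleep so the sketch runs instantly. */
static void crs_no_delay(unsigned int ms)
{
	(void)ms;
}

/*
 * Read the vendor/device ID dword, retrying while the device reports CRS.
 * Returns true and stores the ID when a device answers; returns false when
 * nothing is there or the device is still not ready after timeout_ms.
 */
static bool read_vendor_id_with_crs(cfg_read_fn read_id, void *ctx,
				    delay_fn delay_ms,
				    unsigned int timeout_ms, uint32_t *id)
{
	unsigned int delay = 1;

	for (;;) {
		*id = read_id(ctx);

		/* All ones or all zeroes: treat as "no device present". */
		if (*id == 0xffffffffu || *id == 0x00000000u)
			return false;

		/* Anything other than the CRS vendor ID is a real answer. */
		if ((*id & 0xffffu) != CRS_VENDOR_ID)
			return true;

		/* Still CRS: give up once the next delay exceeds the timeout. */
		if (delay > timeout_ms)
			return false;

		delay_ms(delay);
		delay *= 2;
	}
}
```

Note that this only helps when the device actually responds with CRS. A
read that returns ~0 because the link is not up yet is treated as "no
device", which is why the link has to be up before enumeration starts.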

> Does the framework expect pci_bus_read_dev_vendor_id() to always
> succeed when there is indeed a device on that specific bus?

Yes.

> In that case, my driver needs to take care of only starting enumeration
> once the link to the PCIe card is really "up" and functional?

Yes.  Most of the drivers in drivers/pci/host/ have a
*_wait_for_link() function that does this.  I would start by copying
that style.
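
As a rough illustration of that style, here is a self-contained user-space
sketch of a bounded poll for link-up. It is not copied from any driver:
struct fake_pcie, fake_link_up(), link_no_delay(), and the retry count and
interval are assumptions made for this example. In a real driver the
link-up check would read the controller's link status (for instance, the
LTSSM reaching L0) and the delay would be a kernel sleep primitive.

```c
#include <errno.h>
#include <stdbool.h>

#define LINK_WAIT_MAX_RETRIES 10   /* assumed value for this sketch */
#define LINK_WAIT_MSEC        100  /* assumed polling interval */

/* Hypothetical controller model: link comes up after N polls, or never. */
struct fake_pcie {
	int polls_until_up;        /* negative means no card: link never comes up */
};

static bool fake_link_up(struct fake_pcie *pcie)
{
	if (pcie->polls_until_up < 0)
		return false;
	if (pcie->polls_until_up == 0)
		return true;
	pcie->polls_until_up--;
	return false;
}

/* Stand-in for a real sleep so the sketch runs instantly. */
static void link_no_delay(unsigned int ms)
{
	(void)ms;
}

/*
 * Poll for link-up a bounded number of times.
 * Returns 0 once the link is up, or -ETIMEDOUT if it never comes up,
 * in which case the caller should not start enumeration.
 */
static int wait_for_link(struct fake_pcie *pcie,
			 void (*delay_ms)(unsigned int ms))
{
	int retries;

	for (retries = 0; retries < LINK_WAIT_MAX_RETRIES; retries++) {
		if (fake_link_up(pcie))
			return 0;
		delay_ms(LINK_WAIT_MSEC);
	}

	return -ETIMEDOUT;
}
```

With a check like this in the probe path, an empty slot results in a
timeout instead of a config read to bus 1 that never completes.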

> I was given the advice to move the link detection code to the
> probe function, and reset the host bridge (to save power) when
> no link is detected after some time. What do you think?

I would copy what the other drivers do.  After you have a stable,
working driver, you can worry about power.

Bjorn