Why do we check for "link-up" in *_pcie_valid_device()?

Bjorn Helgaas helgaas at kernel.org
Fri Dec 22 09:28:15 PST 2017


On Fri, Dec 22, 2017 at 01:02:28PM +0000, Bharat Kumar Gogada wrote:
> Bjorn wrote:
>> In the PCI config access path, the *_pcie_valid_device() functions
>> in the dwc, altera, rockchip, and xilinx drivers all check whether
>> the link is up.
>> 
>> I think this is racy because the link may go down after we check but
>> before we perform the config access.
>> 
>> What would blow up if we removed the *_pcie_link_up() checks?
>> 
>> I'd like to either remove the checks or add comments about why the
>> race is acceptable.  If we've covered this before, I apologize.
>> Adding a comment will keep me from pestering you about this again in
>> the future.

> In both Xilinx drivers, when the link is down the hardware responds
> with an AXI DECERR/SLVERR status, which raises a synchronous external
> abort exception on the CPU.  This hangs the system, so we need this
> check in both of our drivers.  We will add comments.

This is a problem.  Checking whether the link is up is a workaround,
not a real solution: the check is racy, so your system may still hang
if the link goes down after the check but before the config access.

A real solution would be to handle the synchronous external abort
so it doesn't cause a system hang.
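
To make the race concrete, here is a minimal user-space sketch in C.
It is not driver code: every name in it (fake_bridge, fake_link_up,
fake_hw_read, and the two read helpers) is hypothetical, and the
"abort" is modeled as a plain error return rather than a CPU exception.
It assumes the usual PCI convention that a failed config read returns
all ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a host bridge; not taken from any real driver. */
struct fake_bridge {
    bool link_up;          /* current link state */
    bool drop_after_check; /* simulate the link dropping right after a check */
};

/* Reports the link state, then optionally drops the link to model the race. */
static bool fake_link_up(struct fake_bridge *b)
{
    bool up = b->link_up;

    if (b->drop_after_check)
        b->link_up = false;
    return up;
}

/*
 * Models the hardware access. Returns false when the link is down; on the
 * hardware described above this would be DECERR/SLVERR and an external abort.
 */
static bool fake_hw_read(const struct fake_bridge *b, uint32_t *val)
{
    if (!b->link_up)
        return false;
    *val = 0x12345678; /* arbitrary register contents for the sketch */
    return true;
}

/*
 * Workaround: check the link, then access.
 * Returns 0 on success, -1 if the check refused the access, and -2 if the
 * link dropped between the check and the access (the unhandled-abort case).
 */
static int read_with_link_check(struct fake_bridge *b, uint32_t *val)
{
    assert(val != NULL);
    if (!fake_link_up(b)) {
        *val = 0xffffffff;
        return -1;
    }
    /* Race window: the link may go down here. */
    if (!fake_hw_read(b, val))
        return -2;
    return 0;
}

/*
 * Sketch of the other approach: no link check; the failed access itself is
 * handled and reported as an all-ones read.
 */
static int read_with_abort_handling(struct fake_bridge *b, uint32_t *val)
{
    assert(val != NULL);
    if (!fake_hw_read(b, val)) {
        *val = 0xffffffff;
        return -1;
    }
    return 0;
}
```

With drop_after_check set, read_with_link_check() passes the check and
still hits the failed access, which is the window described above.
read_with_abort_handling() has no such window because it never relies
on a stale link state.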

Bjorn


