Possible regression between 4.9 and 4.13

Mason slash.tmp at free.fr
Wed Aug 30 01:55:37 PDT 2017


On 30/08/2017 08:02, Greg Kroah-Hartman wrote:

> To get back to the original issue here, the hardware seems to have died,
> the driver stops talking to it, and all is good.  The "regression" here
> is that we now properly can determine that the hardware is crap.

Before 4.12, when I unplugged my USB3 Flash drive, Linux would
detect a few "Uncorrected Non-Fatal errors" via AER, but it was
still possible to plug the drive back in.

Since 4.12, once I unplug the drive, the whole USB3 card is marked
as dead (all 4 ports), and I can no longer plug anything in (not even
the USB2 drive that didn't have any issues, IIRC).

It seems a bit premature to "mark as dead" something that remains
functional, doesn't it?

Disclaimer: there are many variables in this setup, and I've only
tested a small fraction of the problem space (one system, one
USB3 board, one USB3 Flash drive).

> So, how do you think we should proceed, delay a bit longer before saying
> the device is gone?  How long is "long enough"?  How many bus errors are
> we allowed to tolerate (hint, the PCI spec says none...)
> 
> Maybe someone wants to get to the root problem here, why is the hardware
> suddenly reporting all 1s?

I'm afraid I won't be able to make any progress on this front,
unless I can get my hands on a PCIe packet analyzer.
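
For readers unfamiliar with the "all 1s" symptom: a read from a PCI
device that does not respond typically comes back with every bit set
(0xffffffff for a 32-bit register), so a driver can treat that value
as a hint that the device is gone. Below is a minimal sketch of that
check, plus a hypothetical "tolerate N consecutive bad reads" policy
along the lines of the "how long is long enough" question. This is
not the actual xhci code; the function names and the retry count are
assumptions made up for illustration.

```python
def looks_removed(reg_value: int, width_bits: int = 32) -> bool:
    """Return True if a register read came back with every bit set."""
    mask = (1 << width_bits) - 1
    return (reg_value & mask) == mask


def device_gone(read_reg, retries: int = 3) -> bool:
    """Hypothetical policy: declare the device gone only if `retries`
    consecutive reads all return all-ones.

    `read_reg` is an assumed callable that returns one register value
    per call; it stands in for a real MMIO or config-space read.
    """
    return all(looks_removed(read_reg()) for _ in range(retries))


if __name__ == "__main__":
    # Simulated reads: one glitch, then the device answers normally.
    readings = iter([0xFFFFFFFF, 0x00000001, 0x00000001])
    print(device_gone(lambda: next(readings)))
```

With retries=1 this degenerates to "a single all-ones read means the
device is dead", which is roughly the behaviour I am complaining
about. Whether any retry count above 1 is acceptable is exactly the
open question.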

Regards.


