Possible regression between 4.9 and 4.13
Mason
slash.tmp at free.fr
Wed Aug 23 02:18:36 PDT 2017
On 23/08/2017 09:51, Mathias Nyman wrote:
> On 23.08.2017 09:07, Felipe Balbi wrote:
>
>> Mason writes:
>>
>>> Any idea what could have changed between 4.9 and 4.13 ?
>>
>> Quite a bit:
>>
>> $ git rev-list --no-merges --count v4.13-rc6 ^v4.9 -- drivers/usb/host/xhci drivers/usb/core/
>> 58
>
> A very likely cause is the more aggressive detection of PCI-removed xhci hosts.
>
> See commit d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
> xhci: Rework how we handle unresponsive or hoptlug removed hosts
>
> It checks whether an xhci register read returns 0xffffffff and assumes
> the xhci host died in that case.
>
> Could you add something like the below to check what is killing the host?
> Or a BUG()/WARN() in xhci_hc_died() to get a backtrace of who called it.
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 51cd4b8..ade2ad6 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -922,7 +922,8 @@ void xhci_hc_died(struct xhci_hcd *xhci)
>          if (xhci->xhc_state & XHCI_STATE_DYING)
>                  return;
>
> -        xhci_err(xhci, "xHCI host controller not responding, assume dead\n");
> +        xhci_err(xhci, "xHC not responding in %pf, assume controller is dead\n",
> +                 __builtin_return_address(0));
>          xhci->xhc_state |= XHCI_STATE_DYING;
>
>          xhci_cleanup_command_queue(xhci);
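
For reference, here is a rough user-space sketch of the two ideas in
Mathias's mail: treating an all-ones register value as "controller gone",
and recording who declared the host dead via __builtin_return_address(0).
This is not the code from d9f11ba9f107. struct fake_hcd, STATE_DYING,
reg_looks_dead(), fake_hc_died() and check_status() are made-up names for
illustration only, and the status value is passed in by hand instead of
being read from hardware. It assumes GCC or Clang for the builtin.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define STATE_DYING 0x1u   /* stand-in for XHCI_STATE_DYING (assumption) */

/* Hypothetical miniature of a host controller, not the kernel's struct xhci_hcd. */
struct fake_hcd {
    uint32_t state;
    void *killed_by;       /* return address of whoever declared the host dead */
};

/* A read from a PCI device that has gone away typically returns all bits set. */
static bool reg_looks_dead(uint32_t val)
{
    return val == 0xffffffffu;
}

/* Mark the controller dead once and remember the first caller. */
__attribute__((noinline))
static void fake_hc_died(struct fake_hcd *hcd)
{
    if (hcd->state & STATE_DYING)
        return;            /* already dying: keep the first caller */

    hcd->killed_by = __builtin_return_address(0);
    fprintf(stderr, "controller not responding, called from %p\n",
            hcd->killed_by);
    hcd->state |= STATE_DYING;
}

/* Example caller: give up on the controller if the status reads as all ones. */
static bool check_status(struct fake_hcd *hcd, uint32_t status)
{
    if (reg_looks_dead(status)) {
        fake_hc_died(hcd);
        return false;
    }
    return true;
}
```

Calling check_status() with 0xffffffff sets the dying flag and stores the
caller's address. In the kernel, the %pf format in the patch above would
print that address as a function name instead of a raw pointer.
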
I'll try some coarse bisection to narrow it down.

$ git describe --contains d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
v4.12-rc1~97^2~39

That commit first landed in v4.12-rc1, so I'll check 4.11 first.

I wanted to mention that the xHCI setup prints slightly different
things on 4.9 and 4.13 (at the beginning).
On 4.9
[ 1.240322] xhci_hcd 0000:01:00.0: xHCI Host Controller
[ 1.245617] xhci_hcd 0000:01:00.0: new USB bus registered, assigned bus number 1
[ 1.258691] xhci_hcd 0000:01:00.0: hcc params 0x014051cf hci version 0x100 quirks 0x00000010
[ 1.268090] hub 1-0:1.0: USB hub found
[ 1.271905] hub 1-0:1.0: 4 ports detected
[ 1.276372] xhci_hcd 0000:01:00.0: xHCI Host Controller
[ 1.281645] xhci_hcd 0000:01:00.0: new USB bus registered, assigned bus number 2
[ 1.289173] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 1.297775] hub 2-0:1.0: USB hub found
[ 1.301577] hub 2-0:1.0: 4 ports detected
[ 1.306194] usbcore: registered new interface driver usb-storage
On 4.13
[ 1.222471] pcieport 0000:00:00.0: of_irq_parse_pci: failed with rc=-22
[ 1.229156] xhci_hcd 0000:01:00.0: Resetting
[ 2.268836] xhci_hcd 0000:01:00.0: xHCI Host Controller
[ 2.274126] xhci_hcd 0000:01:00.0: new USB bus registered, assigned bus number 1
[ 2.287222] xhci_hcd 0000:01:00.0: hcc params 0x014051cf hci version 0x100 quirks 0x00000010
[ 2.296653] hub 1-0:1.0: USB hub found
[ 2.300478] hub 1-0:1.0: 4 ports detected
[ 2.304962] xhci_hcd 0000:01:00.0: xHCI Host Controller
[ 2.310246] xhci_hcd 0000:01:00.0: new USB bus registered, assigned bus number 2
[ 2.317776] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 2.326419] hub 2-0:1.0: USB hub found
[ 2.330229] hub 2-0:1.0: 4 ports detected
[ 2.334869] usbcore: registered new interface driver usb-storage

FWIW, "of_irq_parse_pci: failed with rc=-22" seems to come from:
[ 1.257411] [<c03d80c8>] (of_irq_parse_pci) from [<c03d8270>] (of_irq_parse_and_map_pci+0x10/0x2c)
[ 1.266420] [<c03d8270>] (of_irq_parse_and_map_pci) from [<c03100a8>] (pci_assign_irq+0x78/0xb0)
[ 1.275254] [<c03100a8>] (pci_assign_irq) from [<c030a1c8>] (pci_device_probe+0x18/0x128)
[ 1.283476] [<c030a1c8>] (pci_device_probe) from [<c0357864>] (driver_probe_device+0x244/0x2c8)

I first thought the error logging was added by f1aa54840657f, but
that commit only turned one specific error into a warning.
I need to dig a bit more.

Regards.