pciehp 0000:00:1c.0:pcie004: Timeout on hotplug command 0x1038 (issued 65284 msec ago)
Bjorn Helgaas
helgaas at kernel.org
Mon Apr 30 14:17:40 PDT 2018
On Mon, Apr 30, 2018 at 04:48:15PM -0400, Sinan Kaya wrote:
> Bjorn,
>
> On 4/28/2018 9:03 AM, okaya at codeaurora.org wrote:
> >> Hmm, if it is the remove() method then kexec does not use it. kexec uses
> >> the shutdown() method instead. I missed this detail when I replied.
> >
> > Portdrv hooks up the remove handler to shutdown. That's why remove is getting called.
>
> What should we do about this?
>
> Since there is an actual HW errata involved, should we quirk this
> root port and not wait as if remove/shutdown doesn't exist?

I was hoping to avoid a quirk because AFAIK all Intel parts have this
issue, so it will be an ongoing maintenance issue.  I tried to avoid
the timeout delays, e.g., with 40b960831cfa ("PCI: pciehp: Compute
timeout from hotplug command start time").

But we still see the alarming messages, so we should probably add a
quirk to get rid of those.
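
To illustrate the idea of computing the timeout from the command start
time, here is a rough user-space sketch.  It is not the pciehp code; the
function name and the millisecond parameters are made up for
illustration.  The point is only that the wait is bounded by a deadline
measured from when the command was issued, so a command issued long ago
needs no further waiting.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): return how many more
 * milliseconds to wait for a command issued at started_ms, given the
 * current time now_ms and a total budget of budget_ms.
 */
uint64_t remaining_wait_ms(uint64_t started_ms, uint64_t now_ms,
			   uint64_t budget_ms)
{
	uint64_t deadline = started_ms + budget_ms;

	/* The budget has already elapsed: do not wait any longer. */
	if (now_ms >= deadline)
		return 0;

	/* Otherwise wait only for what is left of the budget. */
	return deadline - now_ms;
}
```

With a command issued 65284 msec ago and, say, a 1000 msec budget, this
returns 0, i.e., no extra delay, though a message about the missed
completion could still be printed.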

But I haven't given up on the idea of getting rid of the
pciehp_remove() path. I'm not convinced yet that we actually need to
do anything to shut this device down. I don't like the assumption
that kexec requires this.  A kexec is fundamentally just a branch,
and anything we do before the branch (i.e., in the old kernel), we
should also be able to do after the branch (i.e., in the kexec-ed
kernel).
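
As a toy model of why remove ends up running on the kexec path, consider
the sketch below.  All names here are hypothetical and this is plain
user-space C, not portdrv itself; it only models a driver whose shutdown
hook points at its remove handler versus one with no shutdown hook.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hotplug-capable port. */
struct port_dev {
	int hotplug_enabled;
	int remove_calls;
};

/* Hypothetical driver callbacks, loosely modeled on remove/shutdown. */
struct port_driver_ops {
	void (*remove)(struct port_dev *dev);
	void (*shutdown)(struct port_dev *dev);
};

/* Tear-down handler: disables hotplug and counts invocations. */
void port_remove(struct port_dev *dev)
{
	dev->hotplug_enabled = 0;
	dev->remove_calls++;
}

/* shutdown wired to the remove handler, as described in the thread. */
const struct port_driver_ops shared_ops = {
	.remove = port_remove,
	.shutdown = port_remove,
};

/* Alternative under discussion: nothing is done at shutdown. */
const struct port_driver_ops split_ops = {
	.remove = port_remove,
	.shutdown = NULL,
};

/* What a shutdown/kexec path would do: call shutdown if it is set. */
void run_shutdown(const struct port_driver_ops *ops, struct port_dev *dev)
{
	if (ops->shutdown)
		ops->shutdown(dev);
}
```

With shared_ops, run_shutdown() runs the remove handler; with split_ops
the device is left untouched and the new kernel has to cope with
whatever state it finds.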

> Paul,
> You might want to file a bugzilla so that we can keep our debug
> efforts out of this list.