[PATCH] nvme-pci: Harden drive presence detect in nvme_dev_disable()

Keith Busch kbusch at kernel.org
Thu May 12 19:00:11 PDT 2022


On Fri, May 13, 2022 at 04:07:18AM +0300, Max Gurtovoy wrote:
> On 5/12/2022 5:33 PM, Keith Busch wrote:
> > It might happen if the device requires a Subsystem Reset to activate the new
> > firmware.
> 
> But isn't the reset something that the admin should do?
> 
> Some power cycle, cold reboot, or other reset.
> 
> In the reported case the device resets itself. I'm not sure that's expected.

I'm not familiar with this device or the procedure used to update the firmware,
but I'm aware many vendors still provide their firmware bundled within their
own tooling, so simply running that utility could do all sorts of things
including a device link reset.

If the device is resetting itself without a user or driver app initiating it,
though, that would be bad behavior outside the spec. But if this harmless patch
improves interop regardless of device behavior, then it should be okay to
include.

> BTW, the fix is certainly good for the hot-unplug case, but a FW reset
> shouldn't cause this scenario IMO.

Yeah, the patch is fine as far as I'm concerned, though it is impossible to
prevent every register read from racing with a random link-down event: the
link can still drop between the presence check and the read that follows it.
The commit log indicates that reading registers during such an event causes a
kernel crash, so a more complete fix probably needs to come from the platform
level.
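
To illustrate the window described above, here is a minimal user-space sketch,
not the driver code. Everything in it (struct fake_dev, fake_readl(),
fake_dev_present(), read_status_checked()) is hypothetical and exists only for
illustration. It assumes the common behavior where a read from a device whose
link is down completes with all bits set; on the platform from the commit log
the late read would instead crash the kernel, which this model does not
capture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a device and one of its MMIO registers. */
struct fake_dev {
	bool link_up;   /* simulated PCIe link state */
	uint32_t csts;  /* simulated status register contents */
};

/*
 * Simulated register read. ASSUMPTION: a read over a downed link
 * completes with all ones, as it does on many platforms.
 */
static uint32_t fake_readl(const struct fake_dev *dev)
{
	return dev->link_up ? dev->csts : UINT32_MAX;
}

/* Presence heuristic: an all-ones read is treated as "device gone". */
static bool fake_dev_present(const struct fake_dev *dev)
{
	return fake_readl(dev) != UINT32_MAX;
}

/*
 * Check presence, then read the register. When drop_link_after_check is
 * set, the link goes down between the check and the read, which models
 * a link-down event landing in the unavoidable window.
 * Returns true only if *out holds a valid register value.
 */
static bool read_status_checked(struct fake_dev *dev,
				bool drop_link_after_check, uint32_t *out)
{
	if (!fake_dev_present(dev))
		return false;

	if (drop_link_after_check)
		dev->link_up = false;  /* event arrives after the check */

	*out = fake_readl(dev);        /* this read still hits a dead link */
	return *out != UINT32_MAX;
}
```

With the link held up, read_status_checked() returns the register value. With
drop_link_after_check set, the presence check passes and the following read
still targets a device that is gone, so a check in the driver can only narrow
the window, not close it.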


