nvme-pci: try function level reset on init failure

Nitesh Shetty nj.shetty at samsung.com
Thu Jul 17 01:38:34 PDT 2025


On 15/07/25 12:16PM, Keith Busch wrote:
>From: Keith Busch <kbusch at kernel.org>
>
>NVMe devices from multiple vendors appear to get stuck in a reset state
>that we can't get out of with an NVMe level Controller Reset. The kernel
>would report these with messages that look like:
>
>  Device not ready; aborting reset, CSTS=0x1
>
>These have historically required a power cycle to make them usable
>again, but in many cases, a PCIe FLR is sufficient to restart operation
>without a power cycle. Try it if the initial controller reset fails
>during any nvme reset attempt.
>
>Cc: Chaitanya Kulkarni <chaitanyak at nvidia.com>
>Signed-off-by: Keith Busch <kbusch at kernel.org>
>---
>v1->v2:
>
>  Added code comment explaining the escalation
>
>  Add an informational kernel message that this event occurred
>
>  Use the "pcie_reset_flr()" API instead of "pcie_flr()" since that one
>  checks for quirks and capabilities before writing FLR config bits.
>  Note, NVMe PCI Transport Spec mandates FLR capability, so the latter
>  should not apply to any compliant device, but you never know...
>
> drivers/nvme/host/pci.c | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)
>

Reviewed-by: Nitesh Shetty <nj.shetty at samsung.com>
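
For readers skimming the thread without the diff, the escalation described
in the commit message can be sketched roughly as below. This is a userspace
model only, not the code from drivers/nvme/host/pci.c: struct fake_ctrl,
fake_ctrl_reset(), fake_flr() and reset_with_flr_fallback() are all
hypothetical names, and the model assumes an FLR always clears the stuck
state, which real hardware does not guarantee.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a controller; not the kernel's device struct. */
struct fake_ctrl {
	bool stuck;		/* true: controller reset never completes */
	bool flr_capable;	/* device advertises Function Level Reset */
	int flr_count;		/* number of FLRs issued so far */
};

/* Models an NVMe-level Controller Reset; fails while the device is stuck. */
static int fake_ctrl_reset(struct fake_ctrl *c)
{
	return c->stuck ? -1 : 0;
}

/*
 * Models a PCIe FLR. Assumption for this sketch: a capable device is
 * always unstuck by the FLR.
 */
static int fake_flr(struct fake_ctrl *c)
{
	if (!c->flr_capable)
		return -1;
	c->flr_count++;
	c->stuck = false;
	return 0;
}

/* Try the controller reset; on failure escalate to FLR and retry once. */
static int reset_with_flr_fallback(struct fake_ctrl *c)
{
	int ret = fake_ctrl_reset(c);

	if (ret == 0)
		return 0;

	fprintf(stderr, "controller reset failed, attempting FLR\n");
	if (fake_flr(c) != 0)
		return ret;	/* no FLR available: report original failure */

	return fake_ctrl_reset(c);
}
```

The FLR is attempted only after the ordinary controller reset has already
failed, so a healthy device never sees the heavier reset.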

