nvme-pci: try function level reset on init failure
Nitesh Shetty
nj.shetty at samsung.com
Thu Jul 17 01:38:34 PDT 2025
On 15/07/25 12:16PM, Keith Busch wrote:
>From: Keith Busch <kbusch at kernel.org>
>
>NVMe devices from multiple vendors appear to get stuck in a reset state
>that we can't get out of with an NVMe level Controller Reset. The kernel
>would report these with messages that look like:
>
> Device not ready; aborting reset, CSTS=0x1
>
>These have historically required a power cycle to make them usable
>again, but in many cases, a PCIe FLR is sufficient to restart operation
>without a power cycle. Try it if the initial controller reset fails
>during any nvme reset attempt.
>
>Cc: Chaitanya Kulkarni <chaitanyak at nvidia.com>
>Signed-off-by: Keith Busch <kbusch at kernel.org>
>---
>v1->v2:
>
> Added code comment explaining the escalation
>
> Add an informational kernel message that this event occurred
>
> Use the "pcie_reset_flr()" API instead of "pcie_flr()" since that one
> checks for quirks and capabilities before writing FLR config bits.
> Note, NVMe PCI Transport Spec mandates FLR capability, so the latter
> should not apply to any compliant device, but you never know...
>
> drivers/nvme/host/pci.c | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)
>
Reviewed-by: Nitesh Shetty <nj.shetty at samsung.com>