[PATCH] nvme-pci: let platform handle subsystem reset fallout

Mon Jun 24 09:15:58 PDT 2024

On Mon, Jun 24, 2024 at 09:07:56AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch at kernel.org>
> 
> Scheduling reset_work after a nvme subsystem reset is expected to fail,
> but this also prevents potential handling the platform may provide from
> successfully recovering the link without re-enumeration. Provide a pci
> specific operation that safely initiates a subsystem reset, and instead
> of scheduling reset work, read back the status register to trigger a
> pcie read error.

What does platform mean here?

> @@ -653,6 +654,8 @@ static inline int nvme_reset_subsystem(struct nvme_ctrl *ctrl)
>  
>  	if (!ctrl->subsystem)
>  		return -ENOTTY;
> +	if (ctrl->ops->subsystem_reset)
> +		return ctrl->ops->subsystem_reset(ctrl);
>  	if (!nvme_wait_reset(ctrl))
>  		return -EBUSY;

Branching out into a method but having the default inline here
without any comment explaining it feels weird.

If apple nvme devices supported subsystems resets, the PCIe version
would probably the right thing to do for them as well, but my guess
is they don't anyway.  So maybe return -ENOTTY if no method is
weird up, and turn the old generic one into nvmf_subsystem_reset in
fabrics.c?

> +	/*
> +	 * Taking the shutdown_lock ensures the iomap is not being altered by

s/iomap/BAR mapping/ ?

> +	writel(0x4E564D65, dev->bar + NVME_REG_NSSR);

And now that we're duplicating this constant it could really use a
symbolic name.

> +	/* Read back to trigger platform error handling, if any */
> +	readl(dev->bar + NVME_REG_CSTS);

.. also to flush the posted write above.