[PATCH v3 2/2] nvme: handle persistent internal error AER from NVMe controller

Michael Kelley (LINUX) mikelley at microsoft.com
Tue Jun 7 20:59:00 PDT 2022


From: Christoph Hellwig <hch at lst.de> Sent: Tuesday, June 7, 2022 3:36 AM
> 
> On Mon, Jun 06, 2022 at 05:15:15PM -0700, Michael Kelley wrote:
> > +static void nvme_handle_aer_persistent_error(struct nvme_ctrl *ctrl)
> > +{
> > +	trace_nvme_async_event(ctrl, NVME_AER_ERROR);
> > +
> > +	/*
> > +	 * We can't read the CSTS here because we're in an atomic context on
> > +	 * some transports and the read may require submitting a request to the
> > +	 * controller and getting a response. Such a sequence isn't
> > +	 * likely to be successful anyway if the controller is reporting a
> > +	 * persistent internal error. So assume CSTS.CFS is set.
> > +	 */
> > +	if (nvme_should_reset(ctrl, NVME_CSTS_CFS)) {
> > +		dev_warn(ctrl->device, "resetting controller due to AER\n");
> > +		nvme_reset_ctrl(ctrl);
> 
> I don't think we even need the nvme_should_reset check now.
> 
> nvme_reset_ctrl first calls nvme_change_ctrl_state, which only allows
> the transition to the RESETTING state if it previously was NEW or LIVE,
> so we are already covered.  The only downside would be an extra kernel
> message if we already were in another state.

OK, I agree.  Patch 1/2 can be dropped since there's now no need to
move nvme_should_reset(), and patch 2 is simplified even further.
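
To illustrate why the extra check is redundant, here is a minimal
user-space sketch of the guard Christoph describes.  It is not kernel
code: every model_* name is hypothetical, the state list is reduced to
what the example needs, and returning -EBUSY on a refused transition is
an assumption of this model.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced state set; not the kernel's definitions. */
enum model_ctrl_state {
	MODEL_CTRL_NEW,
	MODEL_CTRL_LIVE,
	MODEL_CTRL_RESETTING,
	MODEL_CTRL_DELETING,
};

struct model_ctrl {
	enum model_ctrl_state state;
	int resets_queued;	/* counts resets that were actually started */
};

/* Allow the move to RESETTING only from NEW or LIVE. */
static bool model_change_state_resetting(struct model_ctrl *ctrl)
{
	switch (ctrl->state) {
	case MODEL_CTRL_NEW:
	case MODEL_CTRL_LIVE:
		ctrl->state = MODEL_CTRL_RESETTING;
		return true;
	default:
		return false;
	}
}

/* The state transition is the first step, so it acts as the guard. */
static int model_reset_ctrl(struct model_ctrl *ctrl)
{
	if (!model_change_state_resetting(ctrl))
		return -EBUSY;	/* assumed error code for this model */
	ctrl->resets_queued++;
	return 0;
}

/* Handler with no separate "should reset" check in front of the reset. */
static void model_handle_aer_persistent_error(struct model_ctrl *ctrl)
{
	model_reset_ctrl(ctrl);
}
```

In this model, calling the handler a second time while the state is
already RESETTING, or calling it in a state such as DELETING, starts no
additional reset, which is the coverage described above.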

I'll do a v4.

Michael


