[PATCH] nvme: fix hang in remove path

Rakesh Pandit rakesh at tuxera.com
Mon Jun 5 12:45:32 PDT 2017


On Mon, Jun 05, 2017 at 11:47:33AM +0300, Rakesh Pandit wrote:
> On Mon, Jun 05, 2017 at 11:44:34AM +0300, Rakesh Pandit wrote:
> > On Sun, Jun 04, 2017 at 06:24:09PM +0300, Sagi Grimberg wrote:
> > > 
> > > > It would make sense to still add:
> > > > 
> > > > if (ctrl->state == NVME_CTRL_DELETING || ctrl->state == NVME_CTRL_DEAD)
> > > > 	return
> > > > 
> > > > inside nvme_configure_apst at the top irrespective of this change.
> > > 
> > > I'm not sure what is the value given that it is taken care of in
> > > .queue_rq?
> > 
> > We would avoid getting error message which says: "failed to set APST
> > feature 7".  Why an error if controller is already under reset.
> 
> I meant deletion.  Of course not a huge value but worth a fix IMHO
> while we are at it.
> 
> > 
> > Note 7 here is NVME_SC_ABORT_REQ.  Also we would avoid walking through
> > all power states inside the nvme_configure_apst as
> > nvme_set_latency_tolerance was called with value
> > PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1) which sets
> > ctrl->ps_max_latency_us to U64_MAX and tries to send a sync command
> > which of course fails with error message.

Even though the change in this patch does fix the hang, I just tested
again and I can still see the above error message "failed to set APST
feature 7" while the nvme_remove PID is executing.

So sync requests are still going through while nvme_remove is
executing, and not everything is handled well in .queue_rq while the
controller is in the deleting or dead state.
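
To make the suggested early return concrete, here is a rough standalone
userspace sketch, not kernel code. Everything prefixed with mock_ or
MOCK_ is made up for illustration (including the LIVE state and the
call counter); only the shape of the state check comes from the snippet
quoted above, and 7 stands in for NVME_SC_ABORT_REQ.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the controller states discussed in this thread. */
enum mock_ctrl_state {
	MOCK_CTRL_LIVE,
	MOCK_CTRL_DELETING,
	MOCK_CTRL_DEAD,
};

struct mock_ctrl {
	enum mock_ctrl_state state;
	int set_features_calls;	/* how many sync commands were attempted */
};

/*
 * Stand-in for the synchronous Set Features command.  Assumption: it
 * fails with status 7 whenever the controller is not live.
 */
static int mock_set_features(struct mock_ctrl *ctrl)
{
	ctrl->set_features_calls++;
	return ctrl->state == MOCK_CTRL_LIVE ? 0 : 7;
}

/* Sketch of the guard: bail out before touching a dying controller. */
static void mock_configure_apst(struct mock_ctrl *ctrl)
{
	int ret;

	if (ctrl->state == MOCK_CTRL_DELETING || ctrl->state == MOCK_CTRL_DEAD)
		return;	/* no power-state walk, no sync command, no error message */

	ret = mock_set_features(ctrl);
	if (ret)
		fprintf(stderr, "failed to set APST feature (%d)\n", ret);
}
```

With the guard in place, a controller in the deleting or dead state
never reaches the sync command, so the error message cannot be printed
from this path.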


