[PATCH] nvme: fix hang in remove path
Rakesh Pandit
rakesh at tuxera.com
Mon Jun 5 12:45:32 PDT 2017
On Mon, Jun 05, 2017 at 11:47:33AM +0300, Rakesh Pandit wrote:
> On Mon, Jun 05, 2017 at 11:44:34AM +0300, Rakesh Pandit wrote:
> > On Sun, Jun 04, 2017 at 06:24:09PM +0300, Sagi Grimberg wrote:
> > >
> > > > It would make sense to still add:
> > > >
> > > > if (ctrl->state == NVME_CTRL_DELETING || ctrl->state == NVME_CTRL_DEAD)
> > > > return
> > > >
> > > > inside nvme_configure_apst at the top irrespective of this change.
> > >
> > > I'm not sure what is the value given that it is taken care of in
> > > .queue_rq?
> >
> > We would avoid getting error message which says: "failed to set APST
> > feature 7". Why an error if controller is already under reset.
>
> I meant deletion. Of course not a huge value but worth a fix IMHO
> while we are at it.
>
> >
> > Note 7 here is NVME_SC_ABORT_REQ. Also we would avoid walking through
> > all power states inside the nvme_configure_apst as
> > nvme_set_latency_tolerance was called with value
> > PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1) which sets
> > ctrl->ps_max_latency_us to U64_MAX and tries to send a sync command
> > which of course fails with error message.

Even though the change in this patch does fix the hang, I just tested
again and can still see the above error message "failed to set APST
feature 7" while the nvme_remove PID is executing.

So sync requests are still going through while nvme_remove is
executing, and not everything is handled well in .queue_rq while the
controller is in the deleting or dead state.
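
For illustration only, the early-return guard suggested above could be
sketched in user space roughly as follows. This is not the driver code:
every mock_* name is made up for the example, the state enum is a
simplified stand-in, and the "command" is just a counter, so it only
shows the control flow that would skip the sync command (and hence the
error message) once the controller is being deleted or is dead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the controller states (assumption, not the
 * kernel's enum). */
enum mock_ctrl_state {
	MOCK_CTRL_LIVE,
	MOCK_CTRL_RESETTING,
	MOCK_CTRL_DELETING,
	MOCK_CTRL_DEAD,
};

/* Hypothetical, heavily reduced controller structure. */
struct mock_ctrl {
	enum mock_ctrl_state state;
	uint64_t ps_max_latency_us;
	unsigned int sync_cmds_sent;	/* counts attempted sync commands */
};

/* True when the controller is being torn down or is already dead. */
static bool mock_ctrl_going_away(const struct mock_ctrl *ctrl)
{
	return ctrl->state == MOCK_CTRL_DELETING ||
	       ctrl->state == MOCK_CTRL_DEAD;
}

/*
 * Mock of the APST configuration step.  Returns true if a sync command
 * was issued, false if the guard skipped it.
 */
static bool mock_configure_apst(struct mock_ctrl *ctrl)
{
	if (mock_ctrl_going_away(ctrl))
		return false;	/* nothing sent, so nothing can fail */

	/* The real code would walk the power states here and send a
	 * sync command; the mock only records the attempt. */
	ctrl->sync_cmds_sent++;
	return true;
}
```

With the guard in place, a call made while the mock controller is in
the deleting or dead state returns without recording a command, while a
live controller still gets configured as before.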