[PATCH 1/2] nvme: pci: simplify timeout handling
Ming Lei
ming.lei at redhat.com
Thu Apr 26 09:16:23 PDT 2018
On Thu, Apr 26, 2018 at 11:57:22PM +0800, Ming Lei wrote:
> Hi Jianchao,
>
> On Thu, Apr 26, 2018 at 11:07:56PM +0800, jianchao.wang wrote:
> > Hi Ming
> >
> > Thanks for your wonderful solution. :)
> >
> > On 04/26/2018 08:39 PM, Ming Lei wrote:
> > > +/*
> > > + * This one is called after queues are quiesced, and no in-fligh timeout
> > > + * and nvme interrupt handling.
> > > + */
> > > +static void nvme_pci_cancel_request(struct request *req, void *data,
> > > + bool reserved)
> > > +{
> > > + /* make sure timed-out requests are covered too */
> > > + if (req->rq_flags & RQF_MQ_TIMEOUT_EXPIRED) {
> > > + req->aborted_gstate = 0;
> > > + req->rq_flags &= ~RQF_MQ_TIMEOUT_EXPIRED;
> > > + }
> > > +
> > > + nvme_cancel_request(req, data, reserved);
> > > +}
> > > +
> > > static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
> > > {
> > > int i;
> > > @@ -2223,10 +2316,17 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
> > > for (i = dev->ctrl.queue_count - 1; i >= 0; i--)
> > > nvme_suspend_queue(&dev->queues[i]);
> > >
> > > + /*
> > > + * safe to sync timeout after queues are quiesced, then all
> > > + * requests(include the time-out ones) will be canceled.
> > > + */
> > > + nvme_sync_queues(&dev->ctrl);
> > > + blk_sync_queue(dev->ctrl.admin_q);
> > > +
> > Looks like blk_sync_queue cannot drain all the timeout work.
> >
> > blk_sync_queue
> > -> del_timer_sync
> > blk_mq_timeout_work
> > -> mod_timer
> > -> cancel_work_sync
> > the timeout work may come back again.
> > we may need to force all the in-flight requests to be timed out with blk_abort_request
> >
>
> blk_abort_request() seems over-kill, we could avoid this race simply by
> returning EH_NOT_HANDLED if the controller is in-recovery.
>
> > > nvme_pci_disable(dev);
> >
> > the interrupt will not come, but there maybe running one.
> > a synchronize_sched() here ?
>
> We may cover this case by moving nvme_suspend_queue() before
> nvme_stop_queues().
Looks not need the above change, because pci_free_irq() called
from nvme_suspend_queue() won't return until any executing interrupts
for this IRQ have completed.
Thanks,
Ming
More information about the Linux-nvme
mailing list