[PATCH 1/2] nvme: pci: simplify timeout handling
jianchao.wang
jianchao.w.wang at oracle.com
Thu Apr 26 18:37:06 PDT 2018
On 04/26/2018 11:57 PM, Ming Lei wrote:
> Hi Jianchao,
>
> On Thu, Apr 26, 2018 at 11:07:56PM +0800, jianchao.wang wrote:
>> Hi Ming
>>
>> Thanks for your wonderful solution. :)
>>
>> On 04/26/2018 08:39 PM, Ming Lei wrote:
>>> +/*
>>> + * This one is called after queues are quiesced, and no in-flight timeout
>>> + * and nvme interrupt handling.
>>> + */
>>> +static void nvme_pci_cancel_request(struct request *req, void *data,
>>> +		bool reserved)
>>> +{
>>> +	/* make sure timed-out requests are covered too */
>>> +	if (req->rq_flags & RQF_MQ_TIMEOUT_EXPIRED) {
>>> +		req->aborted_gstate = 0;
>>> +		req->rq_flags &= ~RQF_MQ_TIMEOUT_EXPIRED;
>>> +	}
>>> +
>>> +	nvme_cancel_request(req, data, reserved);
>>> +}
>>> +
>>> static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
>>> {
>>> 	int i;
>>> @@ -2223,10 +2316,17 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
>>> 	for (i = dev->ctrl.queue_count - 1; i >= 0; i--)
>>> 		nvme_suspend_queue(&dev->queues[i]);
>>>
>>> +	/*
>>> +	 * safe to sync timeout after queues are quiesced, then all
>>> +	 * requests (including the timed-out ones) will be canceled.
>>> +	 */
>>> +	nvme_sync_queues(&dev->ctrl);
>>> +	blk_sync_queue(dev->ctrl.admin_q);
>>> +
>> Looks like blk_sync_queue cannot drain all the timeout work.
>>
>> blk_sync_queue
>>   -> del_timer_sync
>>                           blk_mq_timeout_work
>>                             -> mod_timer
>>   -> cancel_work_sync
>> The timeout work may come back again.
>> We may need to force all the in-flight requests to be timed out with blk_abort_request.
>>
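To make the ordering above concrete, here is a rough user-space sketch of it. This is not kernel code: every model_* name is made up for illustration, and the helpers only mimic the one effect of del_timer_sync, mod_timer and cancel_work_sync that matters for this race.

```c
#include <stdbool.h>

/* Hypothetical stand-in for q->timeout (timer) and q->timeout_work (work). */
struct model_queue {
	bool timer_armed;	/* the timeout timer is pending */
	bool work_running;	/* the timeout work is executing right now */
	bool work_queued;	/* the timeout work is queued to run later */
};

/* Models del_timer_sync(): the timer is not pending when it returns. */
static void model_del_timer_sync(struct model_queue *q)
{
	q->timer_armed = false;
}

/* Models the mod_timer() call made by an already running timeout work. */
static void model_timeout_work_rearm(struct model_queue *q)
{
	q->timer_armed = true;
}

/* Models cancel_work_sync(): waits for the work, does not touch the timer. */
static void model_cancel_work_sync(struct model_queue *q)
{
	q->work_running = false;
	q->work_queued = false;
}

/* Models timer expiry: an armed timer queues the timeout work again. */
static void model_timer_fires(struct model_queue *q)
{
	if (q->timer_armed) {
		q->timer_armed = false;
		q->work_queued = true;
	}
}

/* Returns true if timeout work is queued again after the sync sequence. */
static bool model_sync_queue_races(bool work_in_progress)
{
	struct model_queue q = {
		.timer_armed = true,
		.work_running = work_in_progress,
		.work_queued = false,
	};

	model_del_timer_sync(&q);
	if (q.work_running)
		model_timeout_work_rearm(&q);	/* lands between the two sync calls */
	model_cancel_work_sync(&q);
	model_timer_fires(&q);

	return q.work_queued;
}
```

In this sketch model_sync_queue_races(true) reports that the work is queued again, while model_sync_queue_races(false) does not, which is the difference between the racy and the quiet case.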
>
> blk_abort_request() seems over-kill, we could avoid this race simply by
> returning EH_NOT_HANDLED if the controller is in-recovery.
Returning BLK_EH_NOT_HANDLED may not be enough.
Please consider the following scenario.

nvme_error_handler
  -> nvme_dev_disable
    -> blk_sync_queue
    // timeout work comes back again due
    // to the scenario above
                                  blk_mq_timeout_work
                                    -> blk_mq_check_expired
                                      -> set aborted_gstate
    -> nvme_pci_cancel_request
      -> RQF_MQ_TIMEOUT_EXPIRED has not been set
      -> nvme_cancel_request
        -> blk_mq_complete_request
          -> do nothing, since aborted_gstate is already set
                                    -> blk_mq_terminate_expired
                                      -> blk_mq_rq_timed_out
                                        -> set RQF_MQ_TIMEOUT_EXPIRED
                                        -> .timeout returns BLK_EH_NOT_HANDLED

The timed-out request is then leaked.
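The interleaving can also be written down as a small user-space model. It is only a sketch under assumptions: the model_* names and MODEL_TIMEOUT_EXPIRED are hypothetical, and the checks are a simplification of what blk_mq_complete_request and blk_mq_terminate_expired do, reduced to the fields mentioned in this thread.

```c
#include <stdbool.h>

#define MODEL_TIMEOUT_EXPIRED 0x1u	/* stands in for RQF_MQ_TIMEOUT_EXPIRED */

/* Hypothetical, reduced view of a request. */
struct model_request {
	unsigned long gstate;		/* current generation/state, non-zero */
	unsigned long aborted_gstate;	/* set by the timeout path to claim it */
	unsigned int flags;
	bool completed;
};

/* First pass of the timeout work: claim the expired request. */
static void model_check_expired(struct model_request *rq)
{
	rq->aborted_gstate = rq->gstate;
}

/* Completion is skipped while the timeout path has claimed the request. */
static void model_complete_request(struct model_request *rq)
{
	if (rq->aborted_gstate != rq->gstate)
		rq->completed = true;
}

/* Mirrors nvme_pci_cancel_request() from the patch under discussion. */
static void model_pci_cancel_request(struct model_request *rq)
{
	if (rq->flags & MODEL_TIMEOUT_EXPIRED) {
		rq->aborted_gstate = 0;
		rq->flags &= ~MODEL_TIMEOUT_EXPIRED;
	}
	model_complete_request(rq);
}

/* Second pass: mark as expired; .timeout is assumed to return NOT_HANDLED. */
static void model_terminate_expired(struct model_request *rq)
{
	if (!(rq->flags & MODEL_TIMEOUT_EXPIRED) &&
	    rq->aborted_gstate == rq->gstate)
		rq->flags |= MODEL_TIMEOUT_EXPIRED;
}

/* Returns true if the request is never completed in the given ordering. */
static bool model_request_leaks(bool cancel_between_passes)
{
	struct model_request rq = {
		.gstate = 1,
		.aborted_gstate = 0,
		.flags = 0,
		.completed = false,
	};

	model_check_expired(&rq);
	if (cancel_between_passes) {
		model_pci_cancel_request(&rq);	/* flag not set yet: no-op */
		model_terminate_expired(&rq);
	} else {
		model_terminate_expired(&rq);
		model_pci_cancel_request(&rq);	/* flag set: request completed */
	}

	return !rq.completed;
}
```

With the cancel landing between the two passes, model_request_leaks(true) reports a leak. When the timeout work finishes first, model_request_leaks(false) does not, because the cancel path then sees the expired flag and resets aborted_gstate before completing.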
>
>>> nvme_pci_disable(dev);
>>
>> No new interrupt will come, but there may be one still running.
>> Maybe a synchronize_sched() here?
>
> We may cover this case by moving nvme_suspend_queue() before
> nvme_stop_queues().
>
> Both are very good catches, thanks!
>