[PATCH 9/9] [RFC] nvme: Fix a race condition
Bart Van Assche
bart.vanassche at sandisk.com
Tue Sep 27 10:09:20 PDT 2016
On 09/27/2016 09:56 AM, James Bottomley wrote:
> On Tue, 2016-09-27 at 09:43 -0700, Bart Van Assche wrote:
>> On 09/27/2016 09:31 AM, Steve Wise wrote:
>>>> @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
>>>> void nvme_stop_queues(struct nvme_ctrl *ctrl)
>>>> {
>>>> struct nvme_ns *ns;
>>>> + struct request_queue *q;
>>>>
>>>> mutex_lock(&ctrl->namespaces_mutex);
>>>> list_for_each_entry(ns, &ctrl->namespaces, list) {
>>>> - blk_mq_cancel_requeue_work(ns->queue);
>>>> - blk_mq_stop_hw_queues(ns->queue);
>>>> + q = ns->queue;
>>>> + blk_quiesce_queue(q);
>>>> + blk_mq_cancel_requeue_work(q);
>>>> + blk_mq_stop_hw_queues(q);
>>>> + blk_resume_queue(q);
>>>> }
>>>> mutex_unlock(&ctrl->namespaces_mutex);
>>>
>>> Hey Bart, should nvme_stop_queues() really be resuming the blk
>>> queue?
>>
>> Hello Steve,
>>
>> Would you perhaps prefer that blk_resume_queue(q) is called from
>> nvme_start_queues()? I think that would make the NVMe code harder to
>> review. The above code won't cause any unexpected side effects if an
>> NVMe namespace is removed after nvme_stop_queues() has been called
>> and before nvme_start_queues() is called. Moving the
>> blk_resume_queue(q) call into nvme_start_queues() will only work as
>> expected if no namespaces are added nor removed between the
>> nvme_stop_queues() and nvme_start_queues() calls. I'm not familiar
>> enough with the NVMe code to know whether or not this change is safe
>> ...
>
> It's something that looks obviously wrong, so explain why you need to
> do it, preferably in a comment above the function.
Hello James and Steve,
I will add a comment.
Please note that the above patch does not change the behavior of
nvme_stop_queues() except that it causes nvme_stop_queues() to wait
until any ongoing nvme_queue_rq() calls have finished.
blk_resume_queue() does not affect the value of the BLK_MQ_S_STOPPED bit
that has been set by blk_mq_stop_hw_queues(). All it does is to resume
pending blk_queue_enter() calls and to ensure that future
blk_queue_enter() calls do not block. Even after blk_resume_queue() has
been called if a new request is queued queue_rq() won't be invoked
because the BLK_MQ_S_STOPPED bit is still set. Patch "dm: Fix a race
condition related to stopping and starting queues" realizes a similar
change in the dm driver and that change has been tested extensively.
Bart.
More information about the Linux-nvme
mailing list