[PATCH 9/9] [RFC] nvme: Fix a race condition

Bart Van Assche bart.vanassche at sandisk.com
Tue Sep 27 09:43:59 PDT 2016


On 09/27/2016 09:31 AM, Steve Wise wrote:
>> @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
>>  void nvme_stop_queues(struct nvme_ctrl *ctrl)
>>  {
>>  	struct nvme_ns *ns;
>> +	struct request_queue *q;
>>
>>  	mutex_lock(&ctrl->namespaces_mutex);
>>  	list_for_each_entry(ns, &ctrl->namespaces, list) {
>> -		blk_mq_cancel_requeue_work(ns->queue);
>> -		blk_mq_stop_hw_queues(ns->queue);
>> +		q = ns->queue;
>> +		blk_quiesce_queue(q);
>> +		blk_mq_cancel_requeue_work(q);
>> +		blk_mq_stop_hw_queues(q);
>> +		blk_resume_queue(q);
>>  	}
>>  	mutex_unlock(&ctrl->namespaces_mutex);
>
> Hey Bart, should nvme_stop_queues() really be resuming the blk queue?

Hello Steve,

Would you perhaps prefer that blk_resume_queue(q) is called from
nvme_start_queues()? I think that would make the NVMe code harder to
review. The above code won't cause any unexpected side effects if an
NVMe namespace is removed after nvme_stop_queues() has been called and
before nvme_start_queues() is called. Moving the blk_resume_queue(q)
call into nvme_start_queues() will only work as expected if no
namespaces are added or removed between the nvme_stop_queues() and
nvme_start_queues() calls. I'm not familiar enough with the NVMe code to
know whether or not that alternative would be safe ...
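
To illustrate the concern, here is a minimal user-space sketch, not kernel
code. struct toy_queue, the toy_*() helpers and late_queue_depth() are
hypothetical names invented for this model, and the nesting counter is an
assumption about how quiesce/resume could be tracked, not a statement about
how blk_quiesce_queue() is implemented. The 'split' flag selects the
alternative in which the resume call moves to the start side.

```c
#include <assert.h>

/* Toy stand-in for struct request_queue; tracks only what the model needs. */
struct toy_queue {
	int quiesce_depth;	/* assumed nesting counter: > 0 means quiesced */
	int stopped;		/* 1 while the hw queues are stopped */
};

static void toy_quiesce(struct toy_queue *q) { q->quiesce_depth++; }
static void toy_resume(struct toy_queue *q) { q->quiesce_depth--; }

/* Model of nvme_stop_queues(); 'split' defers the resume to the start side. */
static void toy_stop_queues(struct toy_queue *qs, int n, int split)
{
	for (int i = 0; i < n; i++) {
		toy_quiesce(&qs[i]);
		qs[i].stopped = 1;
		if (!split)
			toy_resume(&qs[i]);	/* as in the patch above */
	}
}

/* Model of nvme_start_queues(). */
static void toy_start_queues(struct toy_queue *qs, int n, int split)
{
	for (int i = 0; i < n; i++) {
		qs[i].stopped = 0;
		if (split)
			toy_resume(&qs[i]);	/* the alternative being discussed */
	}
}

/*
 * Stop two queues, add a third "namespace" before the restart, then start
 * all three. Returns the quiesce depth of the late-added queue.
 */
int late_queue_depth(int split)
{
	struct toy_queue qs[3] = { {0, 0}, {0, 0}, {0, 0} };

	toy_stop_queues(qs, 2, split);
	/* namespace added here: qs[2] was never quiesced */
	toy_start_queues(qs, 3, split);

	/* queues present for both calls always end up balanced */
	assert(qs[0].quiesce_depth == 0 && qs[1].quiesce_depth == 0);
	return qs[2].quiesce_depth;
}
```

In this model late_queue_depth(0) returns 0, while late_queue_depth(1)
returns -1: the queue added between the two calls is resumed without ever
having been quiesced. A queue removed in that window would have the opposite
problem and stay quiesced.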

Bart.


