[PATCH 9/9] [RFC] nvme: Fix a race condition

James Bottomley jejb at linux.vnet.ibm.com
Tue Sep 27 09:56:00 PDT 2016


On Tue, 2016-09-27 at 09:43 -0700, Bart Van Assche wrote:
> On 09/27/2016 09:31 AM, Steve Wise wrote:
> > > @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
> > >  void nvme_stop_queues(struct nvme_ctrl *ctrl)
> > >  {
> > >  	struct nvme_ns *ns;
> > > +	struct request_queue *q;
> > > 
> > >  	mutex_lock(&ctrl->namespaces_mutex);
> > >  	list_for_each_entry(ns, &ctrl->namespaces, list) {
> > > -		blk_mq_cancel_requeue_work(ns->queue);
> > > -		blk_mq_stop_hw_queues(ns->queue);
> > > +		q = ns->queue;
> > > +		blk_quiesce_queue(q);
> > > +		blk_mq_cancel_requeue_work(q);
> > > +		blk_mq_stop_hw_queues(q);
> > > +		blk_resume_queue(q);
> > >  	}
> > >  	mutex_unlock(&ctrl->namespaces_mutex);
> > 
> > Hey Bart, should nvme_stop_queues() really be resuming the blk
> > queue?
> 
> Hello Steve,
> 
> Would you perhaps prefer that blk_resume_queue(q) is called from 
> nvme_start_queues()? I think that would make the NVMe code harder to 
> review. The above code won't cause any unexpected side effects if an 
> NVMe namespace is removed after nvme_stop_queues() has been called 
> and before nvme_start_queues() is called. Moving the 
> blk_resume_queue(q) call into nvme_start_queues() will only work as 
> expected if no namespaces are added or removed between the 
> nvme_stop_queues() and nvme_start_queues() calls. I'm not familiar 
> enough with the NVMe code to know whether or not this change is safe
> ...

It's something that looks obviously wrong, so explain why you need to
do it, preferably in a comment above the function.

James
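
For illustration only, and not part of the posted patch: a userspace sketch
of the kind of comment requested above, using the rationale Bart gives in the
quoted text. The struct definitions, the singly linked namespace list, the
call_log buffer and the logging stubs are all hypothetical stand-ins so the
ordering can be checked outside the kernel; the real blk_quiesce_queue() and
blk_resume_queue() come from the RFC series and are not reproduced here, and
the namespaces_mutex locking is left out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-ins: the real types and helpers live in the
 * kernel (and in the RFC series) and do far more than log their name. */
struct request_queue { const char *name; };
struct nvme_ns { struct request_queue *queue; struct nvme_ns *next; };
struct nvme_ctrl { struct nvme_ns *namespaces; };

static char call_log[512];	/* records the order of the stubbed calls */

static void log_call(const char *fn, const struct request_queue *q)
{
	size_t used = strlen(call_log);

	snprintf(call_log + used, sizeof(call_log) - used, "%s(%s);", fn, q->name);
}

static void blk_quiesce_queue(struct request_queue *q)
{
	log_call("blk_quiesce_queue", q);
}

static void blk_mq_cancel_requeue_work(struct request_queue *q)
{
	log_call("blk_mq_cancel_requeue_work", q);
}

static void blk_mq_stop_hw_queues(struct request_queue *q)
{
	log_call("blk_mq_stop_hw_queues", q);
}

static void blk_resume_queue(struct request_queue *q)
{
	log_call("blk_resume_queue", q);
}

/*
 * Stop the hardware queues of every namespace of @ctrl.
 *
 * Each queue is quiesced only while its requeue work is cancelled and its
 * hardware queues are stopped, and is resumed before the loop moves on.
 * The resume is deliberately done here and not in nvme_start_queues():
 * namespaces may be added or removed between the two calls, and pairing
 * quiesce/resume inside one function keeps every queue balanced.
 *
 * Sketch only: namespaces_mutex and the kernel list helpers are omitted.
 */
void nvme_stop_queues(struct nvme_ctrl *ctrl)
{
	struct nvme_ns *ns;
	struct request_queue *q;

	for (ns = ctrl->namespaces; ns; ns = ns->next) {
		q = ns->queue;
		blk_quiesce_queue(q);
		blk_mq_cancel_requeue_work(q);
		blk_mq_stop_hw_queues(q);
		blk_resume_queue(q);
	}
}
```

Calling nvme_stop_queues() on a controller with two namespaces leaves
call_log holding the four calls for the first queue followed by the same four
for the second, so every quiesce is matched by a resume before the function
returns. Whether that pairing is the right design is exactly the question
raised in this thread; the sketch only shows where the explanation could live.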




