[PATCH] NVMe: Fix possible scheduling while atomic error
Keith Busch
keith.busch at intel.com
Mon May 23 07:55:56 PDT 2016
On Mon, May 23, 2016 at 07:36:31AM -0700, Christoph Hellwig wrote:
> On Mon, May 23, 2016 at 09:41:16AM -0400, Keith Busch wrote:
> > On Mon, May 23, 2016 at 03:58:07AM -0700, Christoph Hellwig wrote:
> > > On Tue, May 17, 2016 at 03:37:42PM -0600, Keith Busch wrote:
> > > > rcu_read_lock();
> > > > list_for_each_entry_rcu(ns, &ctrl->namespaces, list) {
> > > > - queue_flag_clear_unlocked(QUEUE_FLAG_STOPPED, ns->queue);
> > > > + spin_lock_irq(ns->queue->queue_lock);
> > > > + queue_flag_clear(QUEUE_FLAG_STOPPED, ns->queue);
> > > > + spin_unlock_irq(ns->queue->queue_lock);
> > >
> > > What's the rationale for this change?
> >
> > That was to make it so nvme_queue_rq wouldn't see a stopped flag just
> > before it was starting.
>
> I don't really see how that helps us with any race..
Without the additional spin lock protecting QUEUE_FLAG_STOPPED:
CPU A CPU B
----- -----
nvme_start_queues nvme_queue_rq
if (nvmeq->cq_vector < 0) <-- queue is disabled during reset, returning BLK_MQ_RQ_QUEUE_BUSY
goto out;
if (blk_queue_stopped()) <-- returns true
queue_flag_clear
blk_mq_start_stopped_hw_queues
blk_mq_stop_hw_queues() <-- incorrectly stops h/w queues
Locking the queue in nvme_start_queues before clearing the flag fixes
that up.
> > It's moved out of that function holding the rcu read lock since
> > blk_mq_stop_hw_queues potentially sleeps.
>
> blk_mq_stop_hw_queues is a loop around blk_mq_stop_hw_queue,
> which itself does two calls to cancel_delayed_work and a set_bit.
> None of them can block.
Oops, I'm crossing wires and mixing up what API's do what ... need to
wake up before starting work.
More information about the Linux-nvme
mailing list