[PATCH] NVMe: Avoid interrupt disable during queue init.

Keith Busch keith.busch at intel.com
Fri May 22 08:11:44 PDT 2015

On Fri, 22 May 2015, Parav Pandit wrote:
> On Fri, May 22, 2015 at 8:18 PM, Keith Busch <keith.busch at intel.com> wrote:
>> The rcu protection on nvme queues was removed with the blk-mq conversion
>> as we rely on that layer for h/w access.
> o.k. But above is at level where data I/Os are not even active. It's
> between nvme_kthread and nvme_resume() from power management
> subsystem.
> I must be missing something.

On resume, everything is already reaped from the queues, so there should
be no harm letting the kthread poll an inactive queue. The proposal to
remove the q_lock during queue init makes it possible for the thread to
see the wrong cq phase bit and mess up the completion queue's head by
reaping non-existent entries.
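
To make the phase bit hazard concrete, here is a small user-space sketch,
not driver code. Every sim_* name is hypothetical, and it assumes queue
init resets cq_head and cq_phase before it clears the completion entries.
If the poller can run between those two steps, it may match stale entries
left over from before the reset and advance the head past them.

```c
#include <stdint.h>
#include <string.h>

#define SIM_Q_DEPTH 4

struct sim_cqe {
    uint16_t status;            /* bit 0 carries the phase tag */
};

struct sim_queue {
    struct sim_cqe cqes[SIM_Q_DEPTH];
    uint16_t cq_head;
    uint8_t cq_phase;
};

/* Leave behind what an earlier pass might have: three entries with phase 1. */
static void sim_fill_stale(struct sim_queue *q)
{
    memset(q, 0, sizeof(*q));
    for (int i = 0; i < 3; i++)
        q->cqes[i].status = 1;
}

/* First half of init (assumed order): reset the software view of the queue. */
static void sim_reset_head_and_phase(struct sim_queue *q)
{
    q->cq_head = 0;
    q->cq_phase = 1;
}

/* Second half of init (assumed order): clear the completion entries. */
static void sim_clear_cqes(struct sim_queue *q)
{
    memset(q->cqes, 0, sizeof(q->cqes));
}

/* Poller: reap entries whose phase tag matches the expected phase. */
static int sim_poll(struct sim_queue *q)
{
    int reaped = 0;
    uint16_t head = q->cq_head;
    uint8_t phase = q->cq_phase;

    while ((q->cqes[head].status & 1) == phase) {
        if (++head == SIM_Q_DEPTH) {
            head = 0;
            phase = !phase;     /* expected phase flips on every wrap */
        }
        reaped++;
    }
    q->cq_head = head;
    q->cq_phase = phase;
    return reaped;
}

/* Init runs to completion before the poller looks (what the lock provides). */
static uint16_t sim_head_after_locked_init(void)
{
    struct sim_queue q;

    sim_fill_stale(&q);
    sim_reset_head_and_phase(&q);
    sim_clear_cqes(&q);
    sim_poll(&q);
    return q.cq_head;
}

/* Poller runs between the two halves of init (possible without the lock). */
static uint16_t sim_head_after_racy_init(void)
{
    struct sim_queue q;

    sim_fill_stale(&q);
    sim_reset_head_and_phase(&q);
    sim_poll(&q);               /* matches the stale phase-1 entries */
    sim_clear_cqes(&q);
    return q.cq_head;
}
```

In the locked ordering the poller finds nothing and the head stays at 0.
In the racy ordering the head ends up at 3 even though the device would
post its first new completion at slot 0, so the two are out of step.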

But beyond nvme_resume, it appears a race condition is possible in any
scenario where a device is reinitialized and cannot create the same
number of IO queues as it had originally. Part of the problem is there
doesn't seem to be a way to change a tagset's nr_hw_queues after it was
created. The conditions that lead to this scenario should be uncommon,
so I haven't given it much thought; I need to untangle dynamic namespaces
first. :)
