[PATCH for-4.5 10/13] NVMe: Move error handling to failed reset handler
Keith Busch
keith.busch at intel.com
Thu Feb 11 07:11:47 PST 2016
On Thu, Feb 11, 2016 at 02:50:54PM +0200, Sagi Grimberg wrote:
> On 10/02/2016 20:17, Keith Busch wrote:
> >This moves the dead queue handling out of the namespace removal path
> >and into the reset failure path. It fixes a deadlock condition if the
> >controller fails or link down during del_gendisk.
>
> How does it fix the deadlock?
Previously the queues were setup for failure prior to calling del_gendisk
only if the controller was broken. If the controller happened to be
optimal, this process would have been skipped. If the controller then
failed, the queues wouldn't be killed.
> >+ nvme_dev_disable(dev, false);
> >+
> >+ mutex_lock(&ctrl->namespaces_mutex);
> >+ list_for_each_entry(ns, &ctrl->namespaces, list) {
> >+ if (!kref_get_unless_zero(&ns->kref))
> >+ continue;
> >+
> >+ blk_set_queue_dying(ns->queue);
> >+ blk_mq_abort_requeue_list(ns->queue);
> >+ blk_mq_start_stopped_hw_queues(ns->queue, true);
> >+
> >+ nvme_put_ns(ns);
> >+ }
> >+ mutex_unlock(&ctrl->namespaces_mutex);
> >+}
> >+
>
> Why on earth is this pci specific? This should be in the
> core. Aside from that, I'd really prefer if the core can handle this
> without having the pci (or other) triggering it explicitly, but if
> this must move out of the ns remove then we need documentation on
> what are the rules of when the driver needs to call it.
It's PCI specific only because of the potential need to disable the
controller first (nvme_dev_disable), which is currently PCI specific.
More information about the Linux-nvme
mailing list