[PATCH for-4.5 10/13] NVMe: Move error handling to failed reset handler

Keith Busch keith.busch at intel.com
Thu Feb 11 07:11:47 PST 2016


On Thu, Feb 11, 2016 at 02:50:54PM +0200, Sagi Grimberg wrote:
> On 10/02/2016 20:17, Keith Busch wrote:
> >This moves the dead queue handling out of the namespace removal path
> >and into the reset failure path. It fixes a deadlock condition if the
> >controller fails or link down during del_gendisk.
> 
> How does it fix the deadlock?

Previously the queues were set up for failure prior to calling del_gendisk
only if the controller was already broken. If the controller happened to be
healthy at that point, this step was skipped. If the controller then
failed during del_gendisk, the queues wouldn't be killed, leaving
del_gendisk waiting on I/O that could never complete.
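
To make the ordering problem concrete, here is a minimal user-space
sketch, not driver code. Every name in it (model_ctrl, old_remove_prep,
reset_failed, and so on) is made up for illustration, and it models only
one thing: whether the queues have been marked dying by the time the
controller fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a controller; not the real struct nvme_ctrl. */
struct model_ctrl {
	bool dead;         /* controller has failed */
	bool queues_dying; /* queues fail I/O instead of waiting on hardware */
};

/* Old behaviour: queues are killed before del_gendisk only if the
 * controller is already known to be broken at that moment. */
void old_remove_prep(struct model_ctrl *c)
{
	if (c->dead)
		c->queues_dying = true;
}

/* New behaviour: the failed-reset handler kills the queues whenever
 * the controller fails, independent of the removal path. */
void reset_failed(struct model_ctrl *c)
{
	c->dead = true;
	c->queues_dying = true;
}

/* I/O can finish if the controller is alive, or if the queues have
 * been marked dying so that requests are failed immediately. */
bool io_can_complete(const struct model_ctrl *c)
{
	return !c->dead || c->queues_dying;
}

/* Controller is healthy at the removal-time check, then fails later. */
bool old_path_hangs(void)
{
	struct model_ctrl c = { false, false };

	old_remove_prep(&c); /* healthy, so nothing is done */
	c.dead = true;       /* failure during del_gendisk */
	return !io_can_complete(&c);
}

/* Same failure, but handled by the failed-reset path. */
bool new_path_hangs(void)
{
	struct model_ctrl c = { false, false };

	reset_failed(&c);
	return !io_can_complete(&c);
}

/* Quick self-check of the two scenarios. */
void model_selfcheck(void)
{
	assert(old_path_hangs());
	assert(!new_path_hangs());
}
```

In this model the old path gets stuck only in the window where the
controller passes the removal-time check and fails afterwards, which is
the case the patch is meant to close.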
 
> >+	nvme_dev_disable(dev, false);
> >+
> >+	mutex_lock(&ctrl->namespaces_mutex);
> >+	list_for_each_entry(ns, &ctrl->namespaces, list) {
> >+		if (!kref_get_unless_zero(&ns->kref))
> >+			continue;
> >+
> >+		blk_set_queue_dying(ns->queue);
> >+		blk_mq_abort_requeue_list(ns->queue);
> >+		blk_mq_start_stopped_hw_queues(ns->queue, true);
> >+
> >+		nvme_put_ns(ns);
> >+	}
> >+	mutex_unlock(&ctrl->namespaces_mutex);
> >+}
> >+
> 
> Why on earth is this pci specific? This should be in the
> core. Aside from that, I'd really prefer if the core can handle this
> without having the pci (or other) triggering it explicitly, but if
> this must move out of the ns remove then we need documentation on
> what are the rules of when the driver needs to call it.

It's PCI-specific only because of the potential need to disable the
controller first (nvme_dev_disable), which is currently PCI-specific.
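
For illustration only, one way the split could look is a
transport-independent helper for the queue killing, with the transport
supplying its own disable step first. This is a user-space sketch under
that assumption, not a proposed patch; model_ctrl, model_kill_queues,
model_pci_disable and model_reset_failed are hypothetical names that
merely stand in for the pieces discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real kernel structures. */
struct model_ns {
	bool queue_dying;
};

struct model_ctrl {
	struct model_ns *ns;
	size_t nr_ns;
	bool disabled;
	/* Transport hook; PCI would point this at its disable routine. */
	void (*disable)(struct model_ctrl *ctrl);
};

/* Transport-independent part: mark every namespace queue dying. */
void model_kill_queues(struct model_ctrl *ctrl)
{
	for (size_t i = 0; i < ctrl->nr_ns; i++)
		ctrl->ns[i].queue_dying = true;
}

/* Stand-in for the PCI-specific disable step. */
void model_pci_disable(struct model_ctrl *ctrl)
{
	ctrl->disabled = true;
}

/* Failed-reset path: disable via the transport, then kill the queues. */
void model_reset_failed(struct model_ctrl *ctrl)
{
	if (ctrl->disable)
		ctrl->disable(ctrl);
	model_kill_queues(ctrl);
}

/* Usage example: a two-namespace controller on the modelled PCI transport. */
void model_demo(void)
{
	struct model_ns ns[2] = { { false }, { false } };
	struct model_ctrl c = { ns, 2, false, model_pci_disable };

	model_reset_failed(&c);
	assert(c.disabled);
	assert(ns[0].queue_dying && ns[1].queue_dying);
}
```

The only transport-specific piece in this sketch is the disable hook;
the loop over namespaces has no PCI dependency.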


