[PATCH 04/17] nvme: don't call nvme_kill_queues from nvme_remove_namespaces
Sagi Grimberg
sagi at grimberg.me
Tue Oct 25 13:17:04 PDT 2022
On 10/25/22 20:43, Keith Busch wrote:
> On Tue, Oct 25, 2022 at 07:40:07AM -0700, Christoph Hellwig wrote:
>> @@ -4560,15 +4560,6 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
>> /* prevent racing with ns scanning */
>> flush_work(&ctrl->scan_work);
>>
>> - /*
>> - * The dead states indicates the controller was not gracefully
>> - * disconnected. In that case, we won't be able to flush any data while
>> - * removing the namespaces' disks; fail all the queues now to avoid
>> - * potentially having to clean up the failed sync later.
>> - */
>> - if (ctrl->state == NVME_CTRL_DEAD)
>> - nvme_kill_queues(ctrl);
>> -
>> /* this is a no-op when called from the controller reset handler */
>> nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING_NOIO);
>>
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index ec034d4dd9eff..f971e96ffd3f6 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -3249,6 +3249,16 @@ static void nvme_remove(struct pci_dev *pdev)
>>
>> flush_work(&dev->ctrl.reset_work);
>> nvme_stop_ctrl(&dev->ctrl);
>> +
>> + /*
>> + * The dead states indicates the controller was not gracefully
>> + * disconnected. In that case, we won't be able to flush any data while
>> + * removing the namespaces' disks; fail all the queues now to avoid
>> + * potentially having to clean up the failed sync later.
>> + */
>> + if (dev->ctrl.state == NVME_CTRL_DEAD)
>> + nvme_kill_queues(&dev->ctrl);
>> +
>> nvme_remove_namespaces(&dev->ctrl);
>> nvme_dev_disable(dev, true);
>> nvme_remove_attrs(dev);
>> --
>> 2.30.2
>>
>
> We still need the flush_work(scan_work) prior to killing the queues. It
> looks like it could safely be moved to nvme_stop_ctrl(), which might
> make it easier on everyone if it were there.
If we do end up moving it to nvme_stop_ctrl, can we make a sub-version
of nvme_stop_ctrl that cannot block on I/O (i.e. without ana/scan/auth)?
That would be for multipathing, where we want to tear down the controller
quickly so we can fail over I/O asap.

IIRC this is why scan_work is not in nvme_stop_ctrl to begin with, but
it is also possible that moving it there caused some other deadlock.
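
To illustrate the split I have in mind, here is a rough userspace model, not driver code. Every name in it (model_ctrl, model_stop_ctrl_noio, model_stop_ctrl, the STEP_* flags) is made up for the sketch, and the assumption that ana/scan/auth are exactly the steps that can wait on I/O comes only from the list above:

```c
/*
 * Toy userspace model of the proposed split. None of these names exist
 * in the kernel; they are hypothetical and only show the call structure.
 */
enum model_stop_step {
	STEP_NOIO = 1 << 0,	/* everything assumed never to wait on I/O */
	STEP_ANA  = 1 << 1,	/* assumed to possibly wait on I/O */
	STEP_SCAN = 1 << 2,	/* assumed to possibly wait on I/O */
	STEP_AUTH = 1 << 3,	/* assumed to possibly wait on I/O */
};

struct model_ctrl {
	unsigned int stopped;	/* bitmask of steps performed so far */
};

/* Hypothetical fast variant: safe to call when failing over quickly. */
static void model_stop_ctrl_noio(struct model_ctrl *ctrl)
{
	ctrl->stopped |= STEP_NOIO;
}

/* Hypothetical full variant: fast part first, then the blocking steps. */
static void model_stop_ctrl(struct model_ctrl *ctrl)
{
	model_stop_ctrl_noio(ctrl);
	ctrl->stopped |= STEP_ANA | STEP_SCAN | STEP_AUTH;
}
```

In this model a multipath teardown path would call only the _noio variant, while a full removal path would call the full one. Whether the real steps can be separated this cleanly is exactly the open question.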