[PATCH v3 2/2] nvme-core: fix deadlock in disconnect during scan_work and/or ana_work
Sagi Grimberg
sagi at grimberg.me
Thu Jul 23 21:03:30 EDT 2020
> Actually, I think that the design was to unblock the scan_work and that
> is why nvme_mpath_clear_ctrl_paths was placed before (as the comment
> say).
>
> But looking at the implementation of nvme_mpath_clear_ctrl_paths, it's
> completely unclear why it should take the scan_lock. It is just clearing
> the paths..
>
> I think that the correct patch would be to just not take the scan_lock
> and only take the namespaces_rwsem.
OK, I was able to reproduce this on my setup.
What was needed is that fabrics will allow I/O to pass in
NVME_CTRL_DELETING, which needed this add-on:
--
nvme-fabrics: don't fast fail on ctrl state DELETING
This is now an state that allows for I/O to be sent to the
device, and when the device shall transition into
NVME_CTRL_DELETING_NOIO we shall fail the I/O.
Note that this is fine because the transport itself has
a queue state to protect against queue access.
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index a0ec40ab62ee..a9c1e3b4585e 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -182,7 +182,8 @@ bool nvmf_ip_options_match(struct nvme_ctrl *ctrl,
static inline bool nvmf_check_ready(struct nvme_ctrl *ctrl, struct
request *rq,
bool queue_live)
{
- if (likely(ctrl->state == NVME_CTRL_LIVE))
+ if (likely(ctrl->state == NVME_CTRL_LIVE ||
+ ctrl->state == NVME_CTRL_DELETING))
return true;
return __nvmf_check_ready(ctrl, rq, queue_live);
}
--
Logan,
Can you verify that it works for you?
BTW, I'm still seriously suspicious on why nvme_mpath_clear_ctrl_paths
is taking the scan_lock. It appears that it shouldn't. I'm tempted to
remove it and see if anyone complains...
More information about the Linux-nvme
mailing list