[PATCH v2 1/2] nvme: switch to RCU freeing the namespace
Ming Lin
mlin at kernel.org
Wed May 18 22:52:26 PDT 2016
On Tue, 2016-05-17 at 17:25 -0400, Keith Busch wrote:
> On Tue, May 17, 2016 at 02:09:34PM -0700, Ming Lin wrote:
> > On Tue, May 17, 2016 at 2:07 PM, Keith Busch <keith.busch at intel.com
> > > wrote:
> > > Great, thanks. I was getting good results with this as well.
> >
> > Thanks for the fix. Could you send the patch formally?
>
> Will do. Sending shortly.
>
> > > Bummer. Is your controller using sparse namespaces? The kernel
> > > message
> > > before the bug appears to indicate that.
> >
> > No. Only 1 namespace.
>
> Something must be corrupt then. The below line shows an unallocated
> namespace is failing to identify itself, but if you only report 1 ns,
> we shouldn't have been able to get here from a simple nvme reset.
>
> I think your resets are occuring faster than we anticipated and
> you've
> uncovered another bug. It looks like these may cause trouble if reset
> occurs during active scan work.
I have not found the root cause yet.
Below patch makes reset not occur during active scan work.
And I didn't see the crash any more with this patch.
So it seems there is a race somewhere between reset work and scan work.
drivers/nvme/host/core.c | 13 ++++++++++++-
drivers/nvme/host/nvme.h | 1 +
drivers/nvme/host/pci.c | 3 +++
3 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a57ccd3..8560774 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -89,6 +89,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
case NVME_CTRL_NEW:
case NVME_CTRL_RESETTING:
case NVME_CTRL_RECONNECTING:
+ case NVME_CTRL_SCANING:
changed = true;
/* FALLTHRU */
default:
@@ -126,6 +127,14 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
break;
}
break;
+ case NVME_CTRL_SCANING:
+ switch (old_state) {
+ case NVME_CTRL_LIVE:
+ changed = true;
+ /* FALLTHRU */
+ default:
+ break;
+ }
default:
break;
}
@@ -1755,7 +1764,7 @@ static void nvme_scan_work(struct work_struct *work)
struct nvme_id_ctrl *id;
unsigned nn;
- if (ctrl->state != NVME_CTRL_LIVE)
+ if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_SCANING))
return;
if (nvme_identify_ctrl(ctrl, &id))
@@ -1776,6 +1785,8 @@ static void nvme_scan_work(struct work_struct *work)
if (ctrl->ops->post_scan)
ctrl->ops->post_scan(ctrl);
+
+ nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE);
}
void nvme_queue_scan(struct nvme_ctrl *ctrl)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 3f3945a..2827825 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -76,6 +76,7 @@ enum nvme_ctrl_state {
NVME_CTRL_RESETTING,
NVME_CTRL_RECONNECTING,
NVME_CTRL_DELETING,
+ NVME_CTRL_SCANING,
};
struct nvme_ctrl {
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 02105da..71260c8 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1761,6 +1761,9 @@ static void nvme_reset_work(struct work_struct *work)
struct nvme_dev *dev = container_of(work, struct nvme_dev, reset_work);
int result = -ENODEV;
+ if (dev->ctrl.state == NVME_CTRL_SCANING)
+ return;
+
if (WARN_ON(dev->ctrl.state == NVME_CTRL_RESETTING))
goto out;
More information about the Linux-nvme
mailing list