[PATCH v2 1/2] nvme: switch to RCU freeing the namespace
Ming Lin
mlin at kernel.org
Fri May 20 10:57:13 PDT 2016
On Fri, May 20, 2016 at 7:16 AM, Keith Busch <keith.busch at intel.com> wrote:
> On Thu, May 19, 2016 at 04:48:29PM -0400, Keith Busch wrote:
>> On Wed, May 18, 2016 at 10:52:26PM -0700, Ming Lin wrote:
>> > I have not found the root cause yet.
>> > Below patch makes reset not occur during active scan work.
>> > And I didn't see the crash any more with this patch.
>> >
>> > So it seems there is a race somewhere between reset work and scan work.
>
> I haven't been able to reproduce the same failure you're getting. What I
> see is namespaces being validated fail because the user interrupted the
> process by disabling the controller in the middle of discovery. That's
> not good either.
Sorry I should mention that I was testing nvme-over-fabric tree.
I can't reproduce it either with Linus tree.
>
> We can't just fence off resets during discovery since we need to let
> recovery occur if a controller fails.
>
> There's no good reason to let a user interrupt the process, though.
>
> How about for user initiated resets, we synchronize with the scan
> work and set a controller state that allows resets but not scan:
I didn't see the crash any more with below patch applied to
nvme-over-fabric tree.
I'll still try to find out what the root cause is.
Thanks.
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 1a51584..f7e15df 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1209,6 +1209,13 @@ out_unlock:
> return ret;
> }
>
> +static int nvme_reset_ctrl(struct nvme_ctrl *ctrl)
> +{
> + ctrl->state = NVME_CTRL_NEW;
> + flush_work(&ctrl->scan_work);
> + return ctrl->ops->reset_ctrl(ctrl);
> +}
> +
> static long nvme_dev_ioctl(struct file *file, unsigned int cmd,
> unsigned long arg)
> {
> @@ -1222,7 +1229,7 @@ static long nvme_dev_ioctl(struct file *file, unsigned int cmd,
> return nvme_dev_user_cmd(ctrl, argp);
> case NVME_IOCTL_RESET:
> dev_warn(ctrl->device, "resetting controller\n");
> - return ctrl->ops->reset_ctrl(ctrl);
> + return nvme_reset_ctrl(ctrl);
> case NVME_IOCTL_SUBSYS_RESET:
> return nvme_reset_subsystem(ctrl);
> case NVME_IOCTL_RESCAN:
> @@ -1248,7 +1255,7 @@ static ssize_t nvme_sysfs_reset(struct device *dev,
> struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> int ret;
>
> - ret = ctrl->ops->reset_ctrl(ctrl);
> + ret = nvme_reset_ctrl(ctrl);
> if (ret < 0)
> return ret;
> return count;
> --
More information about the Linux-nvme
mailing list