[PATCH 2/2] nvme-fc: fix race between error recovery and creating association

Ming Lei ming.lei at redhat.com
Mon Nov 6 16:02:33 PST 2023


On Sat, Jul 8, 2023 at 5:24 AM Michael Liang <mliang at purestorage.com> wrote:
>
> There is a small race window between nvme-fc association creation and error
> recovery. Fix this race condition by protecting accessing to controller
> state and ASSOC_FAILED flag under nvme-fc controller lock.
>
> Signed-off-by: Michael Liang <mliang at purestorage.com>
> Reviewed-by: Caleb Sander <csander at purestorage.com>
> ---

...

> @@ -3172,12 +3179,16 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
>                 else
>                         ret = nvme_fc_recreate_io_queues(ctrl);
>         }
> +
> +       spin_lock_irqsave(&ctrl->lock, flags);
>         if (!ret && test_bit(ASSOC_FAILED, &ctrl->flags))
>                 ret = -EIO;
> -       if (ret)
> +       if (ret) {
> +               spin_unlock_irqrestore(&ctrl->lock, flags);
>                 goto out_term_aen_ops;
> -
> +       }
>         changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
> +       spin_unlock_irqrestore(&ctrl->lock, flags);

nvme_change_ctrl_state() may sleep in nvme_kick_requeue_lists(), but
with this patch it is called with ctrl->lock held via spin_lock_irqsave().
This has caused a regression:
"BUG: scheduling while atomic: kworker/u33:5/31604/0x00000002".
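
To illustrate the problem, here is a minimal userspace sketch. It is a toy
model, not kernel code: every toy_* name and both create_assoc_* functions are
hypothetical, and the depth counter only loosely imitates how the kernel
detects a sleeping call in atomic context. The first function follows the
shape of the quoted hunk; the second moves the possibly-sleeping call outside
the lock.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model only: counts held "spinlocks", loosely like preempt_count. */
static int atomic_depth;
/* Number of times a possibly-sleeping call ran while a lock was held. */
static int violations;
/* Stand-in for the ASSOC_FAILED flag. */
static bool assoc_failed;

static void toy_spin_lock(void)   { atomic_depth++; }
static void toy_spin_unlock(void) { atomic_depth--; }

/*
 * Stand-in for nvme_change_ctrl_state(), which may sleep. Instead of
 * sleeping, it records a violation when called with a lock held.
 */
static void toy_change_state_may_sleep(void)
{
	if (atomic_depth > 0)
		violations++;
}

/* Shape of the quoted hunk: the state change happens under the lock. */
static int create_assoc_locked(void)
{
	int ret = 0;

	toy_spin_lock();
	if (assoc_failed)
		ret = -EIO;
	if (ret) {
		toy_spin_unlock();
		return ret;
	}
	toy_change_state_may_sleep();	/* called in atomic context */
	toy_spin_unlock();
	return 0;
}

/* Variant: only the flag check is under the lock. */
static int create_assoc_unlocked(void)
{
	bool failed;

	toy_spin_lock();
	failed = assoc_failed;
	toy_spin_unlock();
	if (failed)
		return -EIO;
	toy_change_state_may_sleep();	/* no lock held here */
	return 0;
}
```

In this model, create_assoc_locked() records a violation on the success path
and create_assoc_unlocked() does not. Note that the second variant leaves a
window between the flag check and the state change, so on its own it would not
close the race the original patch was trying to fix.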

Thanks,



