[PATCH V3 2/2] nvme: Add support for FW activation without reset
Arnav Dawn
a.dawn at samsung.com
Thu Jun 29 06:09:51 PDT 2017
On Sunday 25 June 2017 02:49 PM, Arnav Dawn wrote:
>
> On Friday 23 June 2017 08:55 PM, Keith Busch wrote:
>> On Fri, Jun 23, 2017 at 05:41:21PM +0530, Arnav Dawn wrote:
>>> On Wednesday 21 June 2017 07:07 PM, Keith Busch wrote:
>>>> One last thing I noticed, it looks like the nvme_fw_act_work will break
>>>> if an IO timeout occurs while it's running since that may reset the
>>>> controller and restart IO queues.
>>> could you please clarify more?
>>> I think ,that, if an IO Timeout resets controller and restarts queue,
>>> the nvme_ctrl_pp_status will return false, ending the nvme_fw_act_work.
>> Sure, the nvme_fw_act_work will observe CSTS.PP cleared, then it restarts
>> the IO queues, but the nvme reset requires those queues to be quiesced.
> Thanks for clarifying,
> i have made following changes based on comments, please review.
>
> 1. Added a delay of 100ms for scheduling fw_act_work,
> as work without delay could schedule almost immediately which
> may not give time to the controller to set CSTS.PP.
> Also, Moved fw_act_timeout calculation to the fw_act_work.
>
> + case NVME_AER_NOTICE_FW_ACT_STARTING:
> + schedule_delayed_work(&ctrl->fw_act_work,
> msecs_to_jiffies(100));
> + break;
> + case NVME_AER_ERR_FW_IMG_LOAD:
> + dev_warn(ctrl->device, "FW image load error\n");
> + cancel_delayed_work(&ctrl->fw_act_work);
> + break;
>
>
> 2. Added a check, that the controller is in NVME_CTRL_LIVE state
> before starting queues once PP is cleared or fw activation times out.
> This prevents starting of queues unless the controller is in Live state.
>
> +static void nvme_fw_act_work(struct work_struct *work)
> +{
> + struct nvme_ctrl *ctrl = container_of(to_delayed_work(work),
> + struct nvme_ctrl, fw_act_work);
> + struct nvme_fw_slot_info_log *log;
> + unsigned long fw_act_timeout;
> +
> + if (ctrl->mtfa)
> + fw_act_timeout = jiffies +
> + msecs_to_jiffies(ctrl->mtfa * 100);
> + else
> + fw_act_timeout = jiffies +
> + msecs_to_jiffies(admin_timeout * 1000);
> +
> + nvme_stop_queues(ctrl);
> + while (nvme_ctrl_pp_status(ctrl)) {
> + if (time_after(jiffies, fw_act_timeout)) {
> + dev_warn(ctrl->device,
> + "Fw activation timeout, reset
> controller\n");
> + ctrl->ops->reset_ctrl(ctrl);
> + break;
> + }
> + msleep(100);
> + }
> +
> + if(ctrl->state != NVME_CTRL_LIVE)
> + return;
> +
> + nvme_start_queues(ctrl);
> + /* read FW slot informationi to clear the AER*/
> + log = kmalloc(sizeof(struct nvme_fw_slot_info_log), GFP_KERNEL);
> + if (!log)
> + return;
> +
> + if (nvme_get_fw_slot_info(ctrl, log))
> + dev_warn(ctrl->device,
> + "Get FW SLOT INFO log error\n");
> + kfree(log);
> +}
>
> if this is fine i will send out the next version of patch.
>
>
could you please review this, i will send out the patch if it looks ok.
Thanks
Arnav
More information about the Linux-nvme
mailing list