[PATCH V5 9/9] nvme: pci: support nested EH
jianchao.wang
jianchao.w.wang at oracle.com
Tue May 15 03:02:13 PDT 2018
Hi Ming
On 05/11/2018 08:29 PM, Ming Lei wrote:
> +static void nvme_eh_done(struct nvme_eh_work *eh_work, int result)
> +{
> +	struct nvme_dev *dev = eh_work->dev;
> +	bool top_eh;
> +
> +	spin_lock(&dev->eh_lock);
> +	top_eh = list_is_last(&eh_work->list, &dev->eh_head);
> +	dev->nested_eh--;
> +
> +	/* Fail controller if the top EH can't recover it */
> +	if (!result)
> +		wake_up_all(&dev->eh_wq);
> +	else if (top_eh) {
> +		dev->ctrl_failed = true;
> +		nvme_eh_sched_fail_ctrl(dev);
> +		wake_up_all(&dev->eh_wq);
> +	}
> +
> +	list_del(&eh_work->list);
> +	spin_unlock(&dev->eh_lock);
> +
> +	dev_info(dev->ctrl.device, "EH %d: state %d, eh_done %d, top eh %d\n",
> +		 eh_work->seq, dev->ctrl.state, result, top_eh);
> +	wait_event(dev->eh_wq, nvme_eh_reset_done(dev));
Because nested_eh is decremented here, before the EH exits, a new EH started afterwards will have a confusing seq number.
Please refer to the following log:
[ 1342.961869] nvme nvme0: Abort status: 0x0
[ 1342.961878] nvme nvme0: Abort status: 0x0
[ 1343.148341] nvme nvme0: EH 0: after shutdown, top eh: 1
[ 1403.828484] nvme nvme0: I/O 21 QID 0 timeout, disable controller
[ 1403.828603] nvme nvme0: EH 1: before shutdown
... warning logs are omitted here
[ 1403.984731] nvme nvme0: EH 0: state 4, eh_done -4, top eh 0 // EH 0 goes to wait
[ 1403.984786] nvme nvme0: EH 1: after shutdown, top eh: 1
[ 1464.856290] nvme nvme0: I/O 22 QID 0 timeout, disable controller // timeout again in EH 1
[ 1464.856411] nvme nvme0: EH 1: before shutdown // the new EH gets seq number 1
Is it expected that the new EH has seq number 1 instead of 2?
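To make the question concrete, here is a minimal model of the counter, not the driver code. It assumes that a new EH takes its seq from the current value of nested_eh when it starts; that detail is not shown in the quoted hunk, so treat it as an assumption. struct eh_model, eh_start(), eh_put() and replay_log() are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the nested-EH bookkeeping, not the driver code. */
struct eh_model {
	int nested_eh;	/* EH instances currently counted as active */
};

/* Assumption: a new EH takes the current counter value as its seq. */
static int eh_start(struct eh_model *m)
{
	return m->nested_eh++;
}

/* Counterpart of the nested_eh-- in nvme_eh_done(). */
static void eh_put(struct eh_model *m)
{
	assert(m->nested_eh > 0);
	m->nested_eh--;
}

/*
 * Replay the order of events from the log and return the seq of the
 * EH started by the second timeout (I/O 22).
 */
static int replay_log(bool early_decrement)
{
	struct eh_model m = { .nested_eh = 0 };
	int eh0 = eh_start(&m);		/* "EH 0" */
	int eh1 = eh_start(&m);		/* "EH 1: before shutdown" */

	assert(eh0 == 0 && eh1 == 1);

	/* EH 0 reaches nvme_eh_done() and goes to wait */
	if (early_decrement)
		eh_put(&m);

	return eh_start(&m);		/* EH started by the I/O 22 timeout */
}
```

In this model replay_log(true) returns 1, which matches the second "EH 1: before shutdown" line, while replay_log(false), where the counter is only dropped once the waiting EH really exits, returns 2. This only illustrates the numbering, it is not a proposed fix.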
Thanks
Jianchao