[PATCH] nvme: continue keep alive on error

James Smart jsmart2021 at gmail.com
Wed May 9 14:25:43 PDT 2018


Currently, if the keep_alive command fails, an error message is
logged and keep alive is stopped. This guarantees that the target will
eventually see no keep_alive within a KATO window and fail the
controller.

The keep_alive command may complete in error when the transport or
LLDD is temporarily out of resources. In that case the command should
be retried rather than letting the controller die.

If the command completes in error, log the failure and issue another
keep_alive after a short delay (250ms).

Signed-off-by: James Smart <james.smart at broadcom.com>
---
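Note for reviewers: the delay selection described above can be sketched
in isolation as follows. This is only an illustration, not part of the
patch. SKETCH_HZ and ka_next_delay() are hypothetical names introduced
here; the fixed tick rate of 1000 is an assumption, since the kernel's
HZ is a build-time configuration value.

```c
/* Assumption: stand-in tick rate; the kernel's HZ is configurable. */
#define SKETCH_HZ 1000UL

/*
 * Hypothetical helper (not in the patch): returns the delay, in ticks,
 * before the next keep alive is queued. A status of 0 means the
 * previous keep alive completed successfully.
 */
static unsigned long ka_next_delay(unsigned int kato, int status)
{
	unsigned long delay = kato * SKETCH_HZ;	/* normal cadence: one KATO */

	if (status)
		delay = SKETCH_HZ / 4;		/* error: retry after 250ms */

	return delay;
}
```

With kato = 5, a successful completion yields 5000 ticks and a failed
one yields 250 ticks under the assumed tick rate. In the patch itself
the chosen value is passed directly to schedule_delayed_work().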
 drivers/nvme/host/core.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f3779f350769..6f1b2502fc1c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -791,17 +791,18 @@ static int nvme_submit_user_cmd(struct request_queue *q,
 static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
 {
 	struct nvme_ctrl *ctrl = rq->end_io_data;
+	unsigned long delay = ctrl->kato * HZ;
 
 	blk_mq_free_request(rq);
 
 	if (status) {
-		dev_err(ctrl->device,
-			"failed nvme_keep_alive_end_io error=%d\n",
+		dev_info(ctrl->device,
+			"failed nvme_keep_alive_end_io error=%d, retrying\n",
 				status);
-		return;
+		delay = (HZ / 4);	/* 250ms */
 	}
 
-	schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
+	schedule_delayed_work(&ctrl->ka_work, delay);
 }
 
 static int nvme_keep_alive(struct nvme_ctrl *ctrl)
-- 
2.13.1



