[PATCH 1/2] nvme: fixup kato deadlock
Hannes Reinecke
hare at suse.de
Tue Feb 23 07:07:27 EST 2021
A customer of ours has run into this deadlock with RDMA:
- The ka_work workqueue item is executed.
- A new ka_work workqueue item is scheduled just after that.
- Now both the kato request timeout _and_ the workqueue delay
  will expire at roughly the same time.
- If the timing is right, the workqueue item executes _before_
  the kato request timeout triggers.
- The kato request timeout triggers and starts error recovery.
- Error recovery deadlocks, as it needs to flush the kato
  workqueue item; this is stuck in nvme_alloc_request() as all
  reserved tags are in use.

The reserved tags would have been freed up later, when cancelling all
outstanding requests in the queue:

	nvme_stop_keep_alive(&ctrl->ctrl);
	nvme_rdma_teardown_io_queues(ctrl, false);
	nvme_start_queues(&ctrl->ctrl);
	nvme_rdma_teardown_admin_queue(ctrl, false);
	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);

but as we're stuck in nvme_stop_keep_alive() we'll never get this far.

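The tag exhaustion described above can be sketched in userspace. This is only an illustration: 'struct tag_pool' and its helpers are hypothetical names and not the blk-mq API, the pool size of one is an assumption, and the sketch returns false at the point where the real allocation would sleep waiting for a tag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a pool of reserved tags; not the blk-mq API. */
struct tag_pool {
	int free_tags;
};

/*
 * Take one tag if available. Returns false where a blocking allocator
 * would go to sleep until somebody returns a tag.
 */
static bool tag_pool_try_get(struct tag_pool *pool)
{
	if (pool->free_tags == 0)
		return false;
	pool->free_tags--;
	return true;
}

/* Return one tag, as happens when an outstanding request is cancelled. */
static void tag_pool_put(struct tag_pool *pool)
{
	pool->free_tags++;
}
```

With a pool of one tag, the first tag_pool_try_get() models the in-flight keep-alive and succeeds, and a second call (the re-armed ka_work) fails until tag_pool_put() runs. In the scenario above, the step that would return the tag comes only after the flush that is waiting on the second allocation.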
To fix this, add a new controller flag 'NVME_CTRL_KATO_RUNNING' which
short-circuits nvme_keep_alive() if a keep-alive command is already
in flight.

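The guard can be sketched outside the kernel with a C11 atomic_flag standing in for test_and_set_bit()/clear_bit(). All names here ('struct fake_ctrl', 'fake_keep_alive' and so on) are hypothetical, and the 'submitted' counter exists only to make the effect observable; it is not part of the patch.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the controller; not struct nvme_ctrl. */
struct fake_ctrl {
	atomic_flag ka_running;	/* plays the role of NVME_CTRL_KATO_RUNNING */
	int submitted;		/* keep-alives actually submitted (illustration only) */
};

static void fake_ctrl_init(struct fake_ctrl *ctrl)
{
	atomic_flag_clear(&ctrl->ka_running);
	ctrl->submitted = 0;
}

/* Same shape as the patched nvme_keep_alive(): skip if one is in flight. */
static int fake_keep_alive(struct fake_ctrl *ctrl)
{
	/* Atomically set the flag; a true return means it was already set. */
	if (atomic_flag_test_and_set(&ctrl->ka_running))
		return 0;

	ctrl->submitted++;	/* the request would be allocated and sent here */
	return 0;
}

/* Same shape as the completion handler: permit the next keep-alive. */
static void fake_keep_alive_end_io(struct fake_ctrl *ctrl)
{
	atomic_flag_clear(&ctrl->ka_running);
}
```

Calling fake_keep_alive() twice without an intervening fake_keep_alive_end_io() submits only once, so the second caller never reaches the allocation step that could block.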
Cc: Daniel Wagner <dwagner at suse.de>
Signed-off-by: Hannes Reinecke <hare at suse.de>
---
drivers/nvme/host/core.c | 8 +++++++-
drivers/nvme/host/nvme.h | 1 +
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ea40a3c511da..9b8596eb4047 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1211,6 +1211,7 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
 	bool startka = false;
 
 	blk_mq_free_request(rq);
+	clear_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags);
 
 	if (status) {
 		dev_err(ctrl->device,
@@ -1233,10 +1234,15 @@ static int nvme_keep_alive(struct nvme_ctrl *ctrl)
 {
 	struct request *rq;
 
+	if (test_and_set_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags))
+		return 0;
+
 	rq = nvme_alloc_request(ctrl->admin_q, &ctrl->ka_cmd,
 			BLK_MQ_REQ_RESERVED);
-	if (IS_ERR(rq))
+	if (IS_ERR(rq)) {
+		clear_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags);
 		return PTR_ERR(rq);
+	}
 
 	rq->timeout = ctrl->kato * HZ;
 	rq->end_io_data = ctrl;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index e6efa085f08a..e00e3400c8b6 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -344,6 +344,7 @@ struct nvme_ctrl {
 	int nr_reconnects;
 	unsigned long flags;
 #define NVME_CTRL_FAILFAST_EXPIRED	0
+#define NVME_CTRL_KATO_RUNNING		1
 	struct nvmf_ctrl_options *opts;
 
 	struct page *discard_page;
--
2.29.2