[PATCH 3/4] nvme-rdma: fix a possible use-after-free in controller reset during load

Sagi Grimberg sagi at grimberg.me
Sun Jan 30 01:21:18 PST 2022


Unlike .queue_rq, .submit_async_event does not check the ctrl and
queue readiness before submitting an AER. This may lead to a use-after-free
condition in the following scenario:
1. nvme_rdma_reset_ctrl_work
2. -> nvme_stop_ctrl flushes ctrl async_event_work
3. ctrl sends AEN which is received by the host, which in turn
   schedules AEN handling
4. teardown admin queue (which releases the rdma queue-pair)
5. AEN processed, submits another AER, calling the nvme-rdma driver,
   which prepares the cmd and posts it on the admin qp
==> use-after-free accessing an already freed rdma queue-pair

In order to fix that, add a ctrl and queue state check to verify that the
driver is actually able to post on the rdma qp.

This solves the above race in the reset flow because the ctrl state is
changed to RESETTING, then the async_event_work is flushed, hence from
that point, any other AER command will find the ctrl state to be RESETTING
and bail out without posting the cmd on the rdma qp.
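
For illustration only, the guard can be modelled in user space as below. This
is a simplified sketch, not the driver code: every model_* name is a
hypothetical stand-in for the kernel types, and the reset is collapsed into a
single step, whereas the real flow changes the ctrl state, flushes
async_event_work and tears down the admin queue as separate steps.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state; not the real definitions. */
enum model_ctrl_state { MODEL_CTRL_LIVE, MODEL_CTRL_RESETTING };

struct model_queue {
	bool live;	/* models the NVME_RDMA_Q_LIVE flag bit */
	bool qp_freed;	/* set once teardown has released the queue-pair */
	int posts;	/* number of commands posted on the qp */
};

struct model_ctrl {
	enum model_ctrl_state state;
	struct model_queue admin_q;
};

/*
 * Models the guarded submit path: returns true if the AER was posted,
 * false if the check bailed out before touching the qp.
 */
static bool model_submit_async_event(struct model_ctrl *ctrl)
{
	struct model_queue *q = &ctrl->admin_q;

	/* Same shape as the check added by this patch. */
	if (ctrl->state != MODEL_CTRL_LIVE || !q->live)
		return false;

	q->posts++;	/* stands in for posting on the rdma qp */
	return true;
}

/* Simplification: state change and admin queue teardown in one step. */
static void model_start_reset(struct model_ctrl *ctrl)
{
	ctrl->state = MODEL_CTRL_RESETTING;
	ctrl->admin_q.live = false;
	ctrl->admin_q.qp_freed = true;
}
```

In this model, a submit issued after model_start_reset() returns false and
leaves the post count unchanged, which mirrors step 5 of the scenario above
bailing out instead of touching the freed qp.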

CC: stable at vger.kernel.org
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/rdma.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 850f84d204d0..780e8fbf503f 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1714,9 +1714,13 @@ static void nvme_rdma_submit_async_event(struct nvme_ctrl *arg)
 	struct ib_device *dev = queue->device->dev;
 	struct nvme_rdma_qe *sqe = &ctrl->async_event_sqe;
 	struct nvme_command *cmd = sqe->data;
+	bool queue_ready = test_bit(NVME_RDMA_Q_LIVE, &queue->flags);
 	struct ib_sge sge;
 	int ret;
 
+	if (ctrl->ctrl.state != NVME_CTRL_LIVE || !queue_ready)
+		return;
+
 	ib_dma_sync_single_for_cpu(dev, sqe->dma, sizeof(*cmd), DMA_TO_DEVICE);
 
 	memset(cmd, 0, sizeof(*cmd));
-- 
2.30.2



