[PATCH rfc] rdma/verbs: fix a possible uaf when draining a srq attached qp
Sagi Grimberg
sagi at grimberg.me
Sun May 26 01:31:25 PDT 2024
ib_drain_qp does not do drain a shared recv queue (because it is
shared). However in the absence of any guarantees that the recv
completions were consumed, the ulp can reference these completions
after draining the qp and freeing its associated resources, which
is a uaf [1].
We cannot drain a srq like a normal rq, however in ib_drain_qp
once the qp moved to error state, we reap the recv_cq once in
order to prevent consumption of recv completions after the drain.
[1]:
--
[199856.569999] Unable to handle kernel paging request at virtual address 002248778adfd6d0
<....>
[199856.721701] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
<....>
[199856.827281] Call trace:
[199856.829847] nvmet_parse_admin_cmd+0x34/0x178 [nvmet]
[199856.835007] nvmet_req_init+0x2e0/0x378 [nvmet]
[199856.839640] nvmet_rdma_handle_command+0xa4/0x2e8 [nvmet_rdma]
[199856.845575] nvmet_rdma_recv_done+0xcc/0x240 [nvmet_rdma]
[199856.851109] __ib_process_cq+0x84/0xf0 [ib_core]
[199856.855858] ib_cq_poll_work+0x34/0xa0 [ib_core]
[199856.860587] process_one_work+0x1ec/0x4a0
[199856.864694] worker_thread+0x48/0x490
[199856.868453] kthread+0x158/0x160
[199856.871779] ret_from_fork+0x10/0x18
--
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
Note this patch is not yet tested, but sending it for visibility and
early feedback. While nothing prevents ib_drain_cq to process a cq
directly (even if it has another context) I am not convinced if all
the upper layers don't have any assumptions about a single context
consuming the cq, even if it is while it is drained. It is also
possible to to add ib_reap_cq that fences the cq poll context before
reaping the cq, but this may have other side-effects.
This crash was seen in the wild, and not easy to reproduce. I suspect
that moving the nvmet_wq to be unbound expedited the teardown process
exposing a possible uaf for srq attached qps (which nvmet-rdma has a
mode of using).
drivers/infiniband/core/verbs.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 94a7f3b0c71c..580e9019e96a 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2962,6 +2962,17 @@ void ib_drain_qp(struct ib_qp *qp)
ib_drain_sq(qp);
if (!qp->srq)
ib_drain_rq(qp);
+ else {
+ /*
+ * We cannot drain a srq, however the qp is in error state,
+ * and will not generate new recv completions, hence it should
+ * be enough to reap the recv cq to cleanup any recv completions
+ * that may have placed before we drained. Without this nothing
+ * guarantees that the ulp will free resources and only then
+ * consume the recv completion.
+ */
+ ib_process_cq_direct(qp->recv_cq, -1);
+ }
}
EXPORT_SYMBOL(ib_drain_qp);
--
2.40.1
More information about the Linux-nvme
mailing list