[PATCH RFC] RDMA/verbs: fix a possible UAF when draining an SRQ-attached QP

Max Gurtovoy mgurtovoy at nvidia.com
Sun Jun 2 04:43:23 PDT 2024


Hi Sagi,

On 02/06/2024 11:53, Sagi Grimberg wrote:
>
>
> On 02/06/2024 11:19, Leon Romanovsky wrote:
>> On Sun, May 26, 2024 at 11:31:25AM +0300, Sagi Grimberg wrote:
>>> ib_drain_qp does not drain a shared recv queue (because it is
>>> shared). However, in the absence of any guarantee that the recv
>>> completions were consumed, the ulp can reference these completions
>>> after draining the qp and freeing its associated resources, which
>>> is a use-after-free (uaf) [1].
>>>
>>> We cannot drain a srq like a normal rq; however, in ib_drain_qp,
>>> once the qp has moved to the error state, we reap the recv_cq once
>>> in order to prevent consumption of recv completions after the drain.
>>>
>>> [1]:
>>> -- 
>>> [199856.569999] Unable to handle kernel paging request at virtual address 002248778adfd6d0
>>> <....>
>>> [199856.721701] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
>>> <....>
>>> [199856.827281] Call trace:
>>> [199856.829847]  nvmet_parse_admin_cmd+0x34/0x178 [nvmet]
>>> [199856.835007]  nvmet_req_init+0x2e0/0x378 [nvmet]
>>> [199856.839640]  nvmet_rdma_handle_command+0xa4/0x2e8 [nvmet_rdma]
>>> [199856.845575]  nvmet_rdma_recv_done+0xcc/0x240 [nvmet_rdma]
>>> [199856.851109]  __ib_process_cq+0x84/0xf0 [ib_core]
>>> [199856.855858]  ib_cq_poll_work+0x34/0xa0 [ib_core]
>>> [199856.860587]  process_one_work+0x1ec/0x4a0
>>> [199856.864694]  worker_thread+0x48/0x490
>>> [199856.868453]  kthread+0x158/0x160
>>> [199856.871779]  ret_from_fork+0x10/0x18
>>> -- 
>>>
>>> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
>>> ---
>>> Note this patch is not yet tested, but I am sending it for visibility
>>> and early feedback. While nothing prevents ib_drain_qp from processing
>>> a cq directly (even if the cq has another context), I am not convinced
>>> that all the upper layers are free of assumptions about a single
>>> context consuming the cq, even while it is drained. It is also
>>> possible to add an ib_reap_cq that fences the cq poll context before
>>> reaping the cq, but this may have other side effects.
>> Did you have a chance to test this patch?
>> I looked at the code and it seems to be a correct change, but I also
>> don't know about all the ULP assumptions.
>
> Not yet...
>
> One thing that is problematic with this patch, though, is that there is
> no stopping condition for the direct poll. So if the CQ is shared among
> a number of qps (and srqs), nothing prevents the polling from consuming
> completions forever...
>
> So we probably need it to be:
> -- 
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 580e9019e96a..f411fef35938 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -2971,7 +2971,7 @@ void ib_drain_qp(struct ib_qp *qp)
>                  * guarantees that the ulp will free resources and only then
>                  * consume the recv completion.
>                  */
> -               ib_process_cq_direct(qp->recv_cq, -1);
> +               ib_process_cq_direct(qp->recv_cq, qp->recv_cq->cqe);

I tried to fix this problem a few years ago in [1], but eventually the
patchset was not accepted.

We should probably try to improve the original series (using cq->cqe
instead of -1, for example) and upstream it.

[1] 
https://lore.kernel.org/all/1516197178-26493-1-git-send-email-maxg@mellanox.com/

> }
>  }
>  EXPORT_SYMBOL(ib_drain_qp);
> -- 
>
> While this can also be overkill, because a shared CQ can be considerably
> larger than the QP size. But at least it is finite.
