slab-use-after-free in __ib_process_cq
Sagi Grimberg
sagi at grimberg.me
Thu May 4 08:45:54 PDT 2023
>>> Hi,
>>>
>>> While testing Jens' for-next branch I encountered a use-after-free
>>> issue, triggered by test nvmeof-mp/002. This is not the first time I see
>>> this issue - I had already observed this several weeks ago but I had not
>>> yet had the time to report this.
>>
>> That is surprising because this area did not change for quite a while
>> now.
>>
>> CCing linux-rdma as well, I'm assuming that this is with rxe?
>> Does this happen with siw as well?
>
> Hi Sagi,
>
> This happened with the siw driver. I haven't tried the rxe driver for a
> while.
>
> The crash addresses correspond to the following source file and line:
>
> (gdb) list *(__ib_process_cq+0x11c)
> 0x7f7c is in __ib_process_cq (drivers/infiniband/core/cq.c:110).
> 105 budget - completed), wcs)) > 0) {
> 106 for (i = 0; i < n; i++) {
> 107 struct ib_wc *wc = &wcs[i];
> 108
> 109 if (wc->wr_cqe)
> 110 wc->wr_cqe->done(cq, wc);
> 111 else
> 112 WARN_ON_ONCE(wc->status == IB_WC_SUCCESS);
> 113 }
> 114
>
> (gdb) list *(nvme_rdma_create_queue_ib+0x1a7)
> 0x3d47 is in nvme_rdma_create_queue_ib (drivers/nvme/host/rdma.c:219).
> 214 {
> 215 struct nvme_rdma_qe *ring;
> 216 int i;
> 217
> 218 ring = kcalloc(ib_queue_size, sizeof(struct nvme_rdma_qe), GFP_KERNEL);
> 219 if (!ring)
> 220 return NULL;
> 221
> 222 /*
> 223 * Bind the CQEs (post recv buffers) DMA mapping to the RDMA queue
>
> (gdb) list *(nvme_rdma_destroy_queue_ib+0x1b8)
> 0x2388 is in nvme_rdma_destroy_queue_ib (drivers/nvme/host/rdma.c:358).
> 353 kfree(ndev);
> 354 }
> 355
> 356 static void nvme_rdma_dev_put(struct nvme_rdma_device *dev)
> 357 {
> 358 kref_put(&dev->ref, nvme_rdma_free_dev);
> 359 }
> 360
> 361 static int nvme_rdma_dev_get(struct nvme_rdma_device *dev)
> 362 {
>
> Shouldn't ib_drain_qp() be called before nvme_rdma_destroy_queue_ib()
> destroys the QP?
Yes, it absolutely should, and according to the code it is.
The only way this can happen is if something posts a WR after the
drain has started, but I can't see how that would happen...
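
To make the ordering concrete, here is a small userspace toy model of the
invariant discussed above. It is only a sketch, not kernel code: every name
in it (toy_qp, toy_qe, toy_post_recv, toy_drain, toy_destroy) is hypothetical,
and the "draining" flag that rejects late posts is an assumption of the model.
As far as I recall, the real ib_drain_qp() has no such guard and instead
relies on the caller making sure nothing else posts WRs concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one ring entry (think struct nvme_rdma_qe). */
struct toy_qe {
	int done_calls;		/* how often the completion handler ran */
};

/* Hypothetical stand-in for a QP plus its receive ring. */
struct toy_qp {
	struct toy_qe *ring;	/* memory the completions point into */
	size_t ring_size;
	struct toy_qe **pending; /* posted, not yet completed */
	size_t npending;
	bool draining;		/* model-only guard, see lead-in */
};

static int toy_qp_init(struct toy_qp *qp, size_t ring_size)
{
	qp->ring = calloc(ring_size, sizeof(*qp->ring));
	qp->pending = calloc(ring_size, sizeof(*qp->pending));
	if (!qp->ring || !qp->pending) {
		free(qp->ring);
		free(qp->pending);
		return -1;
	}
	qp->ring_size = ring_size;
	qp->npending = 0;
	qp->draining = false;
	return 0;
}

/* Post one receive; the model refuses it once the drain has started. */
static int toy_post_recv(struct toy_qp *qp, size_t idx)
{
	if (qp->draining || idx >= qp->ring_size ||
	    qp->npending == qp->ring_size)
		return -1;
	qp->pending[qp->npending++] = &qp->ring[idx];
	return 0;
}

/* Flush everything outstanding; returns the number of completions run. */
static size_t toy_drain(struct toy_qp *qp)
{
	size_t n = qp->npending;

	qp->draining = true;
	for (size_t i = 0; i < n; i++)
		qp->pending[i]->done_calls++;	/* ring is still allocated */
	qp->npending = 0;
	return n;
}

/* Free the ring only if no completion can still reference it. */
static int toy_destroy(struct toy_qp *qp)
{
	if (qp->npending != 0)
		return -1;	/* in the real code this is the use-after-free */
	free(qp->pending);
	free(qp->ring);
	qp->pending = NULL;
	qp->ring = NULL;
	return 0;
}
```

In this model, destroying with WRs outstanding is refused, and a post that
arrives after toy_drain() is rejected. Without the model's guard, such a late
post would leave a pending entry pointing into memory that toy_destroy() is
about to free, which is the scenario described above.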