[PATCH v2] nvmet-rdma: fix possible bad dereference when freeing rsps
Keith Busch
kbusch at kernel.org
Wed May 8 06:20:13 PDT 2024
On Wed, May 08, 2024 at 10:53:06AM +0300, Sagi Grimberg wrote:
> It is possible that the host connected and saw a cm established
> event and started sending nvme capsules on the qp, however the
> ctrl did not yet see an established event. This is why the
> rsp_wait_list exists (for async handling of these cmds, we move
> them to a pending list).
>
> Furthermore, it is possible that the ctrl cm times out, resulting
> in a connect-error cm event. In this case we hit a bad deref [1]
> because in nvmet_rdma_free_rsps we assume that all the responses
> are in the free list.
>
> We are freeing the cmds array anyway, so don't even bother to
> remove the rsp from the free_list. It is also guaranteed that we
> are not racing anything when we are releasing the queue so no
> other context accessing this array should be running.
>
> [1]:
> --
> Workqueue: nvmet-free-wq nvmet_rdma_free_queue_work [nvmet_rdma]
> [...]
> pc : nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma]
> lr : nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma]
> Call trace:
> nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma]
> nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma]
> process_one_work+0x1ec/0x4a0
> worker_thread+0x48/0x490
> kthread+0x158/0x160
> ret_from_fork+0x10/0x18
> --
>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
Thanks, applied to nvme-6.9.
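
For readers following the thread, the idea in the quoted commit message can be sketched in userspace C. This is only an illustrative model under stated assumptions, not the nvmet-rdma code: the list helpers, struct rsp, struct queue, and every function name below are hypothetical stand-ins. It shows teardown walking the backing array by index, so a response that was already taken off the free list (and whose linkage is no longer valid) is never unlinked a second time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel-style doubly linked list node. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* After unlinking, the node's pointers are invalid (NULL in this model). */
static void list_del(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = node->prev = NULL;
}

/* Hypothetical response object; 'entry' must stay the first member. */
struct rsp {
	struct list_node entry;
	void *buf;
};

struct queue {
	struct rsp *rsps;		/* backing array holding every response */
	size_t nr_rsps;
	struct list_node free_rsps;	/* responses currently available */
};

static int queue_init(struct queue *q, size_t nr)
{
	q->rsps = calloc(nr, sizeof(*q->rsps));
	if (!q->rsps)
		return -1;
	q->nr_rsps = nr;
	list_init(&q->free_rsps);
	for (size_t i = 0; i < nr; i++) {
		q->rsps[i].buf = malloc(64);
		if (!q->rsps[i].buf) {
			while (i--)
				free(q->rsps[i].buf);
			free(q->rsps);
			return -1;
		}
		list_add_tail(&q->rsps[i].entry, &q->free_rsps);
	}
	return 0;
}

/* Take a response off the free list, as command handling would. */
static struct rsp *queue_get_rsp(struct queue *q)
{
	struct list_node *n = q->free_rsps.next;

	if (n == &q->free_rsps)
		return NULL;
	list_del(n);
	return (struct rsp *)n;	/* valid because 'entry' is the first member */
}

/*
 * Teardown that never touches list linkage. Calling list_del() on every
 * element here would dereference NULL for any response not on the free list.
 */
static size_t queue_free_rsps(struct queue *q)
{
	size_t freed = 0;

	for (size_t i = 0; i < q->nr_rsps; i++) {
		free(q->rsps[i].buf);
		freed++;
	}
	free(q->rsps);
	q->rsps = NULL;
	q->nr_rsps = 0;
	return freed;
}

/* One response is still outstanding when the queue is torn down. */
size_t demo_teardown(void)
{
	struct queue q;
	struct rsp *pending;

	if (queue_init(&q, 4))
		return 0;
	pending = queue_get_rsp(&q);
	assert(pending != NULL);
	return queue_free_rsps(&q);
}
```

In this model demo_teardown() releases all four responses even though one of them is no longer on the free list. Since the whole array is discarded at the end, unlinking each element first would buy nothing, which is the reasoning given in the commit message.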