[PATCH 1/1] nvme-rdma: Fix memory leak during queue allocation
Sagi Grimberg
sagi at grimberg.me
Thu Nov 9 03:02:49 PST 2017
> In case the nvme_rdma_wait_for_cm timeout expires before we get
> an established or rejected event (rdma_connect succeeded) from
> rdma_cm, we end up leaking the ib resources for the dedicated
> queue.
> This scenario can easily be reproduced using a traffic test during port
> toggling.
>
> Signed-off-by: Max Gurtovoy <maxg at mellanox.com>
> ---
> drivers/nvme/host/rdma.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 0ebb539..fcb278a 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -545,13 +545,16 @@ static int nvme_rdma_alloc_queue(struct nvme_rdma_ctrl *ctrl,
>  	if (ret) {
>  		dev_info(ctrl->ctrl.device,
>  			"rdma_resolve_addr wait failed (%d).\n", ret);
Are you rebased? I think this message has changed.
> -		goto out_destroy_cm_id;
> +		goto out_destroy_queue_ib;
>  	}
>
>  	clear_bit(NVME_RDMA_Q_DELETING, &queue->flags);
>
>  	return 0;
>
> +out_destroy_queue_ib:
> +	if (ret == -ETIMEDOUT)
> +		nvme_rdma_destroy_queue_ib(queue);
This does not look safe to me. What prevents nvme_rdma_cm_handler
from destroying the ib queue as well? I think we need to destroy the
cm_id first (which guarantees that we will never handle any further
cma events) and only then destroy the ib queue if needed.
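
To illustrate the ordering I mean, here is a rough user-space model,
not driver code and not a tested patch. Every name prefixed with
model_ is a hypothetical stand-in: model_destroy_cm_id plays the role
of rdma_destroy_id, model_cm_event plays the role of a late
nvme_rdma_cm_handler invocation, and the ib_destroy_calls counter only
exists to make a double teardown visible. It assumes that once the
cm_id is gone no further events are delivered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the queue state; not the driver's struct. */
struct model_queue {
	bool cm_id_alive;	/* models queue->cm_id still existing */
	bool ib_allocated;	/* models the ib resources of the queue */
	int ib_destroy_calls;	/* counts teardowns to expose double frees */
};

/* Models nvme_rdma_destroy_queue_ib(); must run at most once. */
static void model_destroy_queue_ib(struct model_queue *q)
{
	assert(q->ib_allocated);	/* a second call would be a double free */
	q->ib_allocated = false;
	q->ib_destroy_calls++;
}

/* Models a CM event handler that tears down the ib resources. */
static void model_cm_event(struct model_queue *q)
{
	if (!q->cm_id_alive)
		return;		/* assumption: no cm_id means no callbacks */
	if (q->ib_allocated)
		model_destroy_queue_ib(q);
}

/* Models rdma_destroy_id(): no events are handled after it returns. */
static void model_destroy_cm_id(struct model_queue *q)
{
	q->cm_id_alive = false;
}

/* Error path in the suggested order: cm_id first, then ib if needed. */
static void model_alloc_queue_error_path(struct model_queue *q)
{
	model_destroy_cm_id(q);
	if (q->ib_allocated)
		model_destroy_queue_ib(q);
}
```

With this order the ib resources are released exactly once, whether
the handler ran before the error path or an event shows up after it.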