[PATCH] nvmet: release the sq ref on rdma read errors

Sagi Grimberg sagi at grimberg.me
Wed May 3 22:50:56 PDT 2017


> Nice catch Vijay!
>
> Reviewed-by: Sagi Grimberg <sagi at grimberg.me>


Wait... let me take that back.

While it is true that we need to drop the reference on
the nvmet_sq, there is no point in queuing a response
message because the rdma qp is in error state and the response
will never make it to the host.

Moreover, posting a send (and a recv) on a qp in error state can
potentially give us a flush completion after we have drained the qp,
which can trigger a use-after-free condition. We rely on ib_drain_qp to
guarantee that we'll never see more completions for this queue,
so that we can safely free the resources.

I think we should explicitly drop the sq reference and release
the rsp without triggering the TX path, and add a detailed
comment explaining why we are doing this. Maybe a nicer way to do
this is to introduce an nvme_req_uninit() that would take care of
it in the right layer (that is, nvmet core).
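
A rough sketch of that error path, written as a standalone userspace
model rather than as a kernel patch. Every model_* name below is made up
for illustration, the reference count is a plain int instead of the real
nvmet_sq reference, and model_req_uninit() only stands in for the
nvme_req_uninit() helper suggested above, which does not exist yet.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these are not the real kernel structures. */
struct model_sq {
	int ref;		/* stands in for the nvmet_sq reference */
};

struct model_rsp {
	bool in_use;		/* true while a command owns this rsp */
};

struct model_queue {
	struct model_sq sq;
	int wrs_posted;		/* send/recv work requests posted to the qp */
};

/* Command setup: take a reference on the sq and claim the rsp. */
void model_req_init(struct model_queue *q, struct model_rsp *rsp)
{
	q->sq.ref++;
	rsp->in_use = true;
}

/* Hypothetical core helper: undo model_req_init() without responding. */
void model_req_uninit(struct model_queue *q)
{
	q->sq.ref--;
}

/* Give the rsp back to its pool; nothing is posted to the qp. */
void model_release_rsp(struct model_rsp *rsp)
{
	rsp->in_use = false;
}

/*
 * RDMA read completed in error. The qp is in error state, so a response
 * would never reach the host, and posting a send or recv here could
 * produce a flush completion after the qp has been drained. Only drop
 * the sq reference and release the rsp; stay off the TX path.
 */
void model_read_error(struct model_queue *q, struct model_rsp *rsp)
{
	model_req_uninit(q);
	model_release_rsp(rsp);
	/* deliberately no send/recv posted, so wrs_posted is untouched */
}

/* Returns 1 if the error path left the queue balanced and idle. */
int model_demo(void)
{
	struct model_queue q = { .sq = { .ref = 1 }, .wrs_posted = 0 };
	struct model_rsp rsp = { .in_use = false };

	model_req_init(&q, &rsp);
	model_read_error(&q, &rsp);

	return q.sq.ref == 1 && !rsp.in_use && q.wrs_posted == 0;
}
```

The point of the model is only the ordering: the reference taken at
command setup is returned and the rsp is freed, while the count of
posted work requests never changes on the error path.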

CC'ing Steve and Christoph for their thoughts...


