[PATCH] nvme-tcp: fix incorrect h2cdata pdu offset accounting

Keith Busch kbusch at kernel.org
Thu Sep 16 09:45:59 PDT 2021


On Tue, Sep 14, 2021 at 06:38:55PM +0300, Sagi Grimberg wrote:
> When the controller sends us multiple r2t PDUs in a single
> request we need to account for it correctly as our send/recv
> context run concurrently (i.e. we get a new r2t with r2t_offset
> before we updated our iterator and req->data_sent marker). This
> can cause wrong offsets to be sent to the controller.
> 
> To fix that, we will first know that this may happen only in
> the send sequence of the last page, hence we will take
> the r2t_offset to the h2c PDU data_offset, and in
> nvme_tcp_try_send_data loop, we make sure to increment
> the request markers also when we completed a PDU but
> we are expecting more r2t PDUs as we still did not send
> the entire data of the request.
> 
> Fixes: 825619b09ad3 ("nvme-tcp: fix possible use-after-completion")
> Reported-by: Nowak, Lukasz <Lukasz.Nowak at Dell.com>
> Tested-by: Nowak, Lukasz <Lukasz.Nowak at Dell.com>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
> - Keith, can you ask the WD team to test this as well?
> 
>  drivers/nvme/host/tcp.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 645025620154..87b73eb94041 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -607,7 +607,7 @@ static int nvme_tcp_setup_h2c_data_pdu(struct nvme_tcp_request *req,
>  		cpu_to_le32(data->hdr.hlen + hdgst + req->pdu_len + ddgst);
>  	data->ttag = pdu->ttag;
>  	data->command_id = nvme_cid(rq);
> -	data->data_offset = cpu_to_le32(req->data_sent);
> +	data->data_offset = pdu->r2t_offset;

Shouldn't this be "le32_to_cpu(pdu->r2t_offset)"?

>  	data->data_length = cpu_to_le32(req->pdu_len);
>  	return 0;
>  }
> @@ -939,7 +939,15 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
>  			nvme_tcp_ddgst_update(queue->snd_hash, page,
>  					offset, ret);
>  
> -		/* fully successful last write*/
> +		/*
> +		 * update the request iterator except for the last payload send
> +		 * in the request where we don't want to modify it as we may
> +		 * compete with the RX path completing the request.
> +		 */
> +		if (req->data_sent + ret < req->data_len)
> +			nvme_tcp_advance_req(req, ret);
> +
> +		/* fully successful last send in current PDU */
>  		if (last && ret == len) {
>  			if (queue->data_digest) {
>  				nvme_tcp_ddgst_final(queue->snd_hash,
> @@ -951,7 +959,6 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
>  			}
>  			return 1;
>  		}
> -		nvme_tcp_advance_req(req, ret);
>  	}
>  	return -EAGAIN;
>  }
> -- 
> 2.30.2



More information about the Linux-nvme mailing list