[PATCH] nvme-tcp: Fix H2CData PDU send accounting (again)

Keith Busch kbusch at kernel.org
Mon Oct 25 07:38:38 PDT 2021


On Sun, Oct 24, 2021 at 10:43:31AM +0300, Sagi Grimberg wrote:
> We should not access request members after the last send, even to
> determine whether it was indeed the last data payload send. The
> reason is that a completion could have arrived and triggered a new
> execution of the request, which would have overwritten these members.
> This was fixed by commit 825619b09ad3 ("nvme-tcp: fix possible
> use-after-completion").
> 
> Commit e371af033c56 broke that assumption again to address cases
> where multiple r2t pdus are sent per request. To fix it, we record
> the request's data_sent and data_len before the payload network send,
> and afterwards reference these local copies to determine whether we
> should advance the request iterator.
> 
> Fixes: e371af033c56 ("nvme-tcp: fix incorrect h2cdata pdu offset accounting")
> Reported-by: Keith Busch <kbusch at kernel.org>
> Cc: stable at vger.kernel.org # 5.10+ 
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>  drivers/nvme/host/tcp.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 78966b8ddb1e..42b58eb1ba62 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -926,15 +926,17 @@ static void nvme_tcp_fail_request(struct nvme_tcp_request *req)
>  static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
>  {
>  	struct nvme_tcp_queue *queue = req->queue;
> +	int req_data_len = req->data_len;
>  
>  	while (true) {
>  		struct page *page = nvme_tcp_req_cur_page(req);
>  		size_t offset = nvme_tcp_req_cur_offset(req);
>  		size_t len = nvme_tcp_req_cur_length(req);
> -		bool last = nvme_tcp_pdu_last_send(req, len);
> +		bool pdu_last = nvme_tcp_pdu_last_send(req, len);

I'm not sure why you opted to rename this variable here, but the patch
is testing successfully: no r2t errors observed since applying it, so I
think we're good to go on that regression. (For anyone following along,
a standalone sketch of the race appears after the patch below.)

Reviewed-by: Keith Busch <kbusch at kernel.org>

> +		int req_data_sent = req->data_sent;
>  		int ret, flags = MSG_DONTWAIT;
>  
> -		if (last && !queue->data_digest && !nvme_tcp_queue_more(queue))
> +		if (pdu_last && !queue->data_digest && !nvme_tcp_queue_more(queue))
>  			flags |= MSG_EOR;
>  		else
>  			flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
> @@ -958,11 +960,11 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
>  		 * in the request where we don't want to modify it as we may
>  		 * compete with the RX path completing the request.
>  		 */
> -		if (req->data_sent + ret < req->data_len)
> +		if (req_data_sent + ret < req_data_len)
>  			nvme_tcp_advance_req(req, ret);
>  
>  		/* fully successful last send in current PDU */
> -		if (last && ret == len) {
> +		if (pdu_last && ret == len) {
>  			if (queue->data_digest) {
>  				nvme_tcp_ddgst_final(queue->snd_hash,
>  					&req->ddgst);
> -- 
> 2.30.2
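
As promised above, here is a minimal standalone sketch of the pattern
the patch relies on. This is not the driver code; fake_request,
fake_send, and try_send_chunk are invented for illustration, and the
"send" is assumed to always transmit the full length:

/*
 * Standalone sketch of the hazard (fake_request and fake_send are
 * invented, not from the driver). Once the send that may be the last
 * one has been issued, a completion racing on another CPU can recycle
 * the request and rewrite its members, so the decision to advance
 * must use copies taken before the send.
 */
#include <stdbool.h>
#include <stdio.h>

struct fake_request {
	int data_sent;		/* bytes of payload sent so far */
	int data_len;		/* total payload length */
};

/* Pretend network send; after this returns, the RX path may already
 * have completed the request and reused it for a new command. */
static int fake_send(int len)
{
	return len;		/* assume the full length went out */
}

static bool try_send_chunk(struct fake_request *req, int len)
{
	/* Snapshot BEFORE the send, exactly as the patch does. */
	int req_data_sent = req->data_sent;
	int req_data_len = req->data_len;
	int ret = fake_send(len);

	/* Safe: decided from the snapshots, not from *req, which may
	 * by now belong to a new execution of the request. */
	if (req_data_sent + ret < req_data_len) {
		req->data_sent += ret;	/* not the last send: req is
					 * still ours, advancing is ok */
		return false;
	}
	return true;			/* last send: hands off req */
}

int main(void)
{
	struct fake_request req = { .data_sent = 0, .data_len = 4096 };
	printf("last send: %d\n", try_send_chunk(&req, 4096));
	return 0;
}

The design point is that the iterator is advanced, and req touched at
all, only when the snapshots prove more payload remains, that is, only
when the request cannot yet have been completed and reused by the RX
path.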


