[PATCH v2] nvmet-tcp: fix a segmentation fault during io parsing error
Grupi, Elad
Elad.Grupi at dell.com
Sun Mar 28 06:13:27 BST 2021
Hi Hou,
Even with the code below, the failed REQ is still queued the first time around.
Note the flow in nvmet_tcp_done_recv_pdu:
1. nvmet_req_init calls __nvmet_req_complete with an error status.
1.1. __nvmet_req_complete calls nvmet_tcp_queue_response.
2. Only then does nvmet_tcp_handle_req_failure set the NVMET_TCP_F_INIT_FAILED flag.
In step 1.1 the failed REQ is added to the response queue anyway, because the flag has not been set yet.
In the previous version of the patch I was adding the REQ back to the queue in nvmet_tcp_fetch_cmd, so the check was useful to prevent queueing the REQ over and over again.
I removed that part from this version, which means the REQ is added only once.
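To make the ordering concrete, here is the relevant error path in nvmet_tcp_done_recv_pdu with the two steps marked (a simplified sketch from my reading of the driver, not a literal quote; error-handling details elided):

	if (unlikely(!nvmet_req_init(req, &queue->nvme_cq,
				     &queue->nvme_sq, &nvmet_tcp_ops))) {
		/*
		 * Step 1: by the time nvmet_req_init() returns failure,
		 * it has already called __nvmet_req_complete() with an
		 * error status, and __nvmet_req_complete() has already
		 * called nvmet_tcp_queue_response() (step 1.1), so the
		 * failed REQ is on the response list at this point.
		 */
		nvmet_tcp_handle_req_failure(queue, queue->cmd, req);
		/*
		 * Step 2: NVMET_TCP_F_INIT_FAILED is only set inside
		 * nvmet_tcp_handle_req_failure(), after the REQ was
		 * already queued, so the check in
		 * nvmet_tcp_queue_response() never saw the flag.
		 */
	}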
Elad
-----Original Message-----
From: Hou Pu <houpu.main at gmail.com>
Sent: Sunday, 28 March 2021 7:04
To: Grupi, Elad
Cc: linux-nvme at lists.infradead.org; sagi at grimberg.me; houpu.main at gmail.com
Subject: RE: [PATCH v2] nvmet-tcp: fix a segmentation fault during io parsing error
On Fri, 26 Mar 2021 20:59:41 +0300, Elad Grupi wrote:
> In case an io that contains inline data hits the parsing error flow,
> the command response will free the command and iov before the data
> on the socket buffer has been cleared.
> This patch delays the command response until the receive flow is completed.
>
> Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver")
> Signed-off-by: Elad Grupi <elad.grupi at dell.com>
> ---
> drivers/nvme/target/tcp.c | 29 +++++++++++++++++++++++------
> 1 file changed, 23 insertions(+), 6 deletions(-)
Hi Elad,
I just noticed that we still queue the failed REQ to the response queue in this version. Am I missing something?
Is this still needed? (I think it is.)
@@ -526,6 +527,12 @@ static void nvmet_tcp_queue_response(struct nvmet_req *req)
 		container_of(req, struct nvmet_tcp_cmd, req);
 	struct nvmet_tcp_queue *queue = cmd->queue;
 
+	if (unlikely((cmd->flags & NVMET_TCP_F_INIT_FAILED) &&
+		     nvmet_tcp_has_inline_data(cmd))) {
+		/* fail the cmd when we finish processing the inline data */
+		return;
+	}
+
Thanks,
Hou
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 70cc507d1565..3bc0caa47873 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -702,6 +702,17 @@ static int nvmet_tcp_try_send_one(struct nvmet_tcp_queue *queue,
>  			return 0;
>  	}
>  
> +	if (unlikely((cmd->flags & NVMET_TCP_F_INIT_FAILED) &&
> +		     nvmet_tcp_has_data_in(cmd) &&
> +		     nvmet_tcp_has_inline_data(cmd))) {
> +		/*
> +		 * wait for inline data before processing the response
> +		 * so the iov will not be freed
> +		 */
> +		queue->snd_cmd = NULL;
> +		goto done_send;
> +	}
> +
>  	if (cmd->state == NVMET_TCP_SEND_DATA_PDU) {
>  		ret = nvmet_try_send_data_pdu(cmd);
>  		if (ret <= 0)
> @@ -1103,9 +1114,11 @@ static int nvmet_tcp_try_recv_data(struct nvmet_tcp_queue *queue)
>  		return 0;
>  	}
>  
> -	if (!(cmd->flags & NVMET_TCP_F_INIT_FAILED) &&
> -	    cmd->rbytes_done == cmd->req.transfer_len) {
> -		cmd->req.execute(&cmd->req);
> +	if (cmd->rbytes_done == cmd->req.transfer_len) {
> +		if (unlikely(cmd->flags & NVMET_TCP_F_INIT_FAILED))
> +			nvmet_tcp_queue_response(&cmd->req);
> +		else
> +			cmd->req.execute(&cmd->req);
>  	}
>  
>  	nvmet_prepare_receive_pdu(queue);
> @@ -1143,9 +1156,13 @@ static int nvmet_tcp_try_recv_ddgst(struct nvmet_tcp_queue *queue)
>  		goto out;
>  	}
>  
> -	if (!(cmd->flags & NVMET_TCP_F_INIT_FAILED) &&
> -	    cmd->rbytes_done == cmd->req.transfer_len)
> -		cmd->req.execute(&cmd->req);
> +	if (cmd->rbytes_done == cmd->req.transfer_len) {
> +		if (unlikely(cmd->flags & NVMET_TCP_F_INIT_FAILED))
> +			nvmet_tcp_queue_response(&cmd->req);
> +		else
> +			cmd->req.execute(&cmd->req);
> +	}
> +
>  	ret = 0;
>  out:
>  	nvmet_prepare_receive_pdu(queue);