[PATCH 2/2] nvmet-tcp: fix connect error when setting param_inline_data_size to zero.

Sagi Grimberg sagi at grimberg.me
Thu May 20 15:44:13 PDT 2021


> When setting inline_data_size to zero, connect fails. This can be
> reproduced with the following steps.
> 
> Controller side:
> mkdir /sys/kernel/config/nvmet/ports/1
> cd /sys/kernel/config/nvmet/ports/1
> echo 0.0.0.0 > addr_traddr
> echo 4421 > addr_trsvcid
> echo ipv4 > addr_adrfam
> echo tcp > addr_trtype
> echo 0 > param_inline_data_size
> ln -s /sys/kernel/config/nvmet/subsystems/mysub /sys/kernel/config/nvmet/ports/1/subsystems/mysub
> 
> Host side:
> [  325.145323][  T203] nvme nvme1: Connect command failed, error wo/DNR bit: 22
> [  325.159481][  T203] nvme nvme1: failed to connect queue: 0 ret=16406
> Failed to write to /dev/nvme-fabrics: Input/output error
> 
> Kernel log from controller side is:
> [  114.567411][   T56] nvmet_tcp: queue 0: failed to map data
> [  114.568093][   T56] nvmet_tcp: unexpected pdu type 201
> 
> When the admin-connect command arrives with 1024 bytes of inline data,
> nvmet_tcp_map_data() compares that size against
> cmd->req.port->inline_data_size (which is 0) and responds to the command
> with an error status. But the admin-connect command is always allowed to
> carry up to 8192 bytes of inline data according to the NVMe over Fabrics
> specification.
> 
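
For context, the check that trips here is the inline-data path of
nvmet_tcp_map_data() in drivers/nvme/target/tcp.c; the excerpt below is
paraphrased from memory rather than quoted verbatim:

	if (sgl->type == ((NVME_SGL_FMT_DATA_DESC << 4) |
			  NVME_SGL_FMT_OFFSET)) {
		if (!nvme_is_write(cmd->req.cmd))
			return NVME_SC_INVALID_FIELD | NVME_SC_DNR;

		/*
		 * With param_inline_data_size set to 0, this rejects the
		 * 1024-byte connect capsule, which matches the host-side
		 * status 22 (NVME_SC_SGL_INVALID_OFFSET) in the log above.
		 */
		if (len > cmd->req.port->inline_data_size)
			return NVME_SC_SGL_INVALID_OFFSET | NVME_SC_DNR;
		cmd->pdu_len = len;
	}
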
> The host side decides the inline data size when allocating the queue,
> based on the queue number: queue 0 uses 8k and the other queues use
> ioccsz * 16. The target side should do the same.
> 
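
For reference, the host-side sizing referred to above lives in
drivers/nvme/host/tcp.c (nvme_tcp_alloc_queue() and
nvme_tcp_inline_data_size()); roughly, again paraphrased rather than
quoted verbatim:

	/* capsule size chosen when the queue is allocated */
	if (qid > 0)
		queue->cmnd_capsule_len = nctrl->ioccsz * 16;
	else
		queue->cmnd_capsule_len = sizeof(struct nvme_command) +
						NVME_TCP_ADMIN_CCSZ; /* 8k */

	/* inline data budget derived from the capsule size */
	static inline size_t nvme_tcp_inline_data_size(struct nvme_tcp_queue *queue)
	{
		return queue->cmnd_capsule_len - sizeof(struct nvme_command);
	}

This is the asymmetry the patch mirrors on the target side.
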
> Fixes: 0d5ee2b2ab4f ("nvmet-rdma: support max(16KB, PAGE_SIZE) inline data")
> Signed-off-by: Hou Pu <houpu.main at gmail.com>
> ---
>   drivers/nvme/target/tcp.c | 24 +++++++++++++++++++++---
>   1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index d8aceef83284..83985ab8c3aa 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -167,6 +167,24 @@ static const struct nvmet_fabrics_ops nvmet_tcp_ops;
>   static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c);
>   static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd);
>   
> +static inline int nvmet_tcp_inline_data_size(struct nvmet_tcp_cmd *cmd)
> +{
> +	struct nvmet_tcp_queue *queue = cmd->queue;
> +	struct nvme_command *nvme_cmd = cmd->req.cmd;
> +	int inline_data_size = NVME_TCP_ADMIN_CCSZ;
> +	u16 qid = 0;
> +
> +	if (likely(queue->nvme_sq.ctrl)) {
> +		/* The connect admin/io queue has been executed. */
> +		qid = queue->nvme_sq.qid;
> +		if (qid)
> +			inline_data_size = cmd->req.port->inline_data_size;
> +	} else if (nvme_cmd->connect.qid)
> +		inline_data_size = cmd->req.port->inline_data_size;

How can a connection to an I/O queue arrive without having the ctrl
reference installed? Is this for the failure case?


