[PATCH 2/2] nvmet-tcp: fix connect error when setting param_inline_data_size to zero.
Sagi Grimberg
sagi at grimberg.me
Thu May 20 15:44:13 PDT 2021
> When param_inline_data_size is set to zero, connect fails. This can be
> reproduced with the following steps.
>
> Controller side:
> mkdir /sys/kernel/config/nvmet/ports/1
> cd /sys/kernel/config/nvmet/ports/1
> echo 0.0.0.0 > addr_traddr
> echo 4421 > addr_trsvcid
> echo ipv4 > addr_adrfam
> echo tcp > addr_trtype
> echo 0 > param_inline_data_size
> ln -s /sys/kernel/config/nvmet/subsystems/mysub /sys/kernel/config/nvmet/ports/1/subsystems/mysub
>
> Host side:
> [ 325.145323][ T203] nvme nvme1: Connect command failed, error wo/DNR bit: 22
> [ 325.159481][ T203] nvme nvme1: failed to connect queue: 0 ret=16406
> Failed to write to /dev/nvme-fabrics: Input/output error
>
> Kernel log from controller side is:
> [ 114.567411][ T56] nvmet_tcp: queue 0: failed to map data
> [ 114.568093][ T56] nvmet_tcp: unexpected pdu type 201
>
> When the admin-connect command arrives with 1024 bytes of inline data,
> nvmet_tcp_map_data() compares that length with cmd->req.port->inline_data_size
> (which is 0), so the command is rejected with an error. But the admin-connect
> command is always allowed to carry up to 8192 bytes of inline data according
> to the NVMe over Fabrics specification.
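
For reference, the check that rejects the capsule is the in-capsule data
branch of nvmet_tcp_map_data(); roughly the following (excerpted and trimmed
from the target driver, so treat it as a sketch rather than the exact code):

	/* nvmet_tcp_map_data(), in-capsule data branch (trimmed) */
	if (sgl->type == ((NVME_SGL_FMT_DATA_DESC << 4) |
			  NVME_SGL_FMT_OFFSET)) {
		if (!nvme_is_write(cmd->req.cmd))
			return NVME_SC_INVALID_FIELD | NVME_SC_DNR;

		/*
		 * With param_inline_data_size set to 0 this rejects the
		 * 1024-byte connect data and produces the "failed to map
		 * data" message above.
		 */
		if (len > cmd->req.port->inline_data_size)
			return NVME_SC_SGL_INVALID_OFFSET | NVME_SC_DNR;
		cmd->pdu_len = len;
	}

(NVME_SC_SGL_INVALID_OFFSET is 0x16, which lines up with the "error wo/DNR
bit: 22" seen on the host side.)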
>
> The host side decides the inline data size when allocating a queue based on
> the queue number: queue 0 uses 8k and the other queues use ioccsz * 16.
> The target side should do the same.
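
For comparison, the host-side sizing mentioned here happens when the queue is
allocated; approximately the following (paraphrased from the host TCP driver,
so names and surrounding context may differ slightly):

	/* host side, on queue allocation (paraphrased) */
	if (qid > 0)
		queue->cmnd_capsule_len = ctrl->ioccsz * 16;
	else
		queue->cmnd_capsule_len = sizeof(struct nvme_command) +
					  NVME_TCP_ADMIN_CCSZ;

So the admin queue always assumes an 8k in-capsule budget
(NVME_TCP_ADMIN_CCSZ) regardless of the target's param_inline_data_size,
which is why the target needs a qid-aware limit like the helper below.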
>
> Fixes: 0d5ee2b2ab4f ("nvmet-rdma: support max(16KB, PAGE_SIZE) inline data")
> Signed-off-by: Hou Pu <houpu.main at gmail.com>
> ---
> drivers/nvme/target/tcp.c | 24 +++++++++++++++++++++---
> 1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index d8aceef83284..83985ab8c3aa 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -167,6 +167,24 @@ static const struct nvmet_fabrics_ops nvmet_tcp_ops;
> static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c);
> static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd);
>
> +static inline int nvmet_tcp_inline_data_size(struct nvmet_tcp_cmd *cmd)
> +{
> +	struct nvmet_tcp_queue *queue = cmd->queue;
> +	struct nvme_command *nvme_cmd = cmd->req.cmd;
> +	int inline_data_size = NVME_TCP_ADMIN_CCSZ;
> +	u16 qid = 0;
> +
> +	if (likely(queue->nvme_sq.ctrl)) {
> +		/* The connect admin/io queue has been executed. */
> +		qid = queue->nvme_sq.qid;
> +		if (qid)
> +			inline_data_size = cmd->req.port->inline_data_size;
> +	} else if (nvme_cmd->connect.qid)
> +		inline_data_size = cmd->req.port->inline_data_size;
How can a connection to an I/O queue arrive without having the ctrl
reference installed? Is this for the failure case?