[PATCH RFC 2/4] nvme-rdma: fix sqsize/hsqsize/hrqsize per spec
Sagi Grimberg
sagi at grimberg.me
Thu Aug 11 00:03:36 PDT 2016
On 11/08/16 07:07, Jay Freyensee wrote:
> Per NVMe-over-Fabrics 1.0 spec, sqsize is represented as
> a 0-based value.
>
> Also per spec, the RDMA binding values shall be set
> to sqsize, which makes hsqsize 0-based values.
>
> Also per spec, though not stated very clearly, hrqsize is
> hsqsize + 1.
>
> Thus, the sqsize during NVMf connect() is now:
>
> [root@fedora23-fabrics-host1 for-48]# dmesg
> [ 318.720645] nvme_fabrics: nvmf_connect_admin_queue(): sqsize for admin queue: 31
> [ 318.720884] nvme nvme0: creating 16 I/O queues.
> [ 318.810114] nvme_fabrics: nvmf_connect_io_queue(): sqsize for i/o queue: 127
>
> Reported-by: Daniel Verkamp <daniel.verkamp at intel.com>
> Signed-off-by: Jay Freyensee <james_p_freyensee at linux.intel.com>
> ---
> drivers/nvme/host/rdma.c | 19 ++++++++++++++++---
> 1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 3e3ce2b..8be64f1 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1284,8 +1284,21 @@ static int nvme_rdma_route_resolved(struct nvme_rdma_queue *queue)
>
> priv.recfmt = cpu_to_le16(NVME_RDMA_CM_FMT_1_0);
> priv.qid = cpu_to_le16(nvme_rdma_queue_idx(queue));
> - priv.hrqsize = cpu_to_le16(queue->queue_size);
> - priv.hsqsize = cpu_to_le16(queue->queue_size);
> +
> + /*
> + * On one end, the fabrics spec is pretty clear that
> + * hsqsize variables shall be set to the value of sqsize,
> + * which is a 0-based number. What is confusing is the value for
> + * hrqsize. After clarification from NVMe spec committee member,
> + * the minimum value of hrqsize is hsqsize+1.
> + */
> + if (priv.qid == 0) {
> + priv.hsqsize = cpu_to_le16(queue->ctrl->ctrl.admin_sqsize);
> + priv.hrqsize = cpu_to_le16(queue->ctrl->ctrl.admin_sqsize+1);
> + } else {
> + priv.hsqsize = cpu_to_le16(queue->ctrl->ctrl.sqsize);
> + priv.hrqsize = cpu_to_le16(queue->ctrl->ctrl.sqsize+1);
> + }
Huh? (scratch...) Setting priv.hrqsize = priv.hsqsize + 1 is pointless.
We expose X to the block layer, we send X-1 to the target, and the
target adds 1 back (looks goofy, but ok). We also size our RDMA
send/recv queues according to X, so why on earth would we want to tell
the target we have a recv queue of size X+1?
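
To make the arithmetic concrete, here is a minimal sketch of the conversions being argued about, using the I/O queue numbers from the dmesg output above (sqsize 127). The helper names are hypothetical and exist nowhere in drivers/nvme. Whether hrqsize is itself a 0-based field is exactly the open question, so the last helper only shows what the value would mean under that reading.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not kernel code. */

/* Host side: a queue of `depth` entries is sent as a 0-based sqsize. */
static uint16_t depth_to_sqsize(uint16_t depth)
{
	assert(depth > 0);	/* a 0-based field cannot encode an empty queue */
	return (uint16_t)(depth - 1);
}

/* Target side: recover the real depth from the 0-based sqsize. */
static uint16_t sqsize_to_depth(uint16_t sqsize)
{
	return (uint16_t)(sqsize + 1);
}

/* hrqsize as the patch computes it: hsqsize + 1. */
static uint16_t patch_hrqsize(uint16_t hsqsize)
{
	return (uint16_t)(hsqsize + 1);
}

/*
 * Assumption: if the receiver reads hrqsize as 0-based as well,
 * the advertised receive queue has one more entry than the value.
 */
static uint16_t entries_if_zero_based(uint16_t field)
{
	return (uint16_t)(field + 1);
}
```

With a block-layer depth of 128, depth_to_sqsize() gives hsqsize = 127 and the target recovers 128. patch_hrqsize(127) gives 128, which matches the send queue only if hrqsize is read as a plain count. Read as 0-based it advertises 129 receive entries, one more than the queues we actually size.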