[RFC PATCH 3/3] nvme: rdma: use ib_device's max_qp_wr to limit sqsize

Guixin Liu kanie at linux.alibaba.com
Mon Dec 18 04:31:32 PST 2023


On 2023/12/18 19:57, Sagi Grimberg wrote:
>
>
> On 12/18/23 13:05, Guixin Liu wrote:
>> Currently, the host is limited to creating queues with a depth of
>> 128. To enable larger queue sizes, constrain the sqsize based on
>> the ib_device's max_qp_wr capability.
>>
>> Also remove the now-unused NVME_RDMA_MAX_QUEUE_SIZE macro.
>>
>> Signed-off-by: Guixin Liu <kanie at linux.alibaba.com>
>> ---
>>   drivers/nvme/host/rdma.c  | 14 ++++++++------
>>   include/linux/nvme-rdma.h |  2 --
>>   2 files changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index 81e2621..982f3e4 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -489,8 +489,7 @@ static int nvme_rdma_create_cq(struct ib_device *ibdev,
>>   static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
>>   {
>>       struct ib_device *ibdev;
>> -    const int send_wr_factor = 3;            /* MR, SEND, INV */
>> -    const int cq_factor = send_wr_factor + 1;    /* + RECV */
>> +    const int cq_factor = NVME_RDMA_SEND_WR_FACTOR + 1;    /* + RECV */
>>       int ret, pages_per_mr;
>>
>>       queue->device = nvme_rdma_find_get_device(queue->cm_id);
>> @@ -508,7 +507,7 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
>>       if (ret)
>>           goto out_put_dev;
>>
>> -    ret = nvme_rdma_create_qp(queue, send_wr_factor);
>> +    ret = nvme_rdma_create_qp(queue, NVME_RDMA_SEND_WR_FACTOR);
>>       if (ret)
>>           goto out_destroy_ib_cq;
>>
>> @@ -1006,6 +1005,7 @@ static int nvme_rdma_setup_ctrl(struct nvme_rdma_ctrl *ctrl, bool new)
>>   {
>>       int ret;
>>       bool changed;
>> +    int ib_max_qsize;
>>
>>       ret = nvme_rdma_configure_admin_queue(ctrl, new);
>>       if (ret)
>> @@ -1030,11 +1030,13 @@ static int nvme_rdma_setup_ctrl(struct nvme_rdma_ctrl *ctrl, bool new)
>>               ctrl->ctrl.opts->queue_size, ctrl->ctrl.sqsize + 1);
>>       }
>>
>> -    if (ctrl->ctrl.sqsize + 1 > NVME_RDMA_MAX_QUEUE_SIZE) {
>> +    ib_max_qsize = ctrl->device->dev->attrs.max_qp_wr /
>> +            (NVME_RDMA_SEND_WR_FACTOR + 1);
>> +    if (ctrl->ctrl.sqsize + 1 > ib_max_qsize) {
>>           dev_warn(ctrl->ctrl.device,
>>               "ctrl sqsize %u > max queue size %u, clamping down\n",
>> -            ctrl->ctrl.sqsize + 1, NVME_RDMA_MAX_QUEUE_SIZE);
>> -        ctrl->ctrl.sqsize = NVME_RDMA_MAX_QUEUE_SIZE - 1;
>> +            ctrl->ctrl.sqsize + 1, ib_max_qsize);
>> +        ctrl->ctrl.sqsize = ib_max_qsize - 1;
>>       }
>
> This can be very, very big; I'm not sure why we should allow a
> potentially giant queue depth. We should also impose a hard limit,
> maybe aligned to the pci driver limit.

When we run "nvme connect", the requested queue depth is validated in
nvmf_parse_options() and must fall between 16 and 1024, so this cannot
become very big; the maximum is 1024 in any case.
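
For reference, the queue_size bounds check lives in nvmf_parse_options()
in drivers/nvme/host/fabrics.c, where NVMF_MIN_QUEUE_SIZE and
NVMF_MAX_QUEUE_SIZE are defined as 16 and 1024. The following is a
paraphrased sketch from memory, not the exact upstream code (the error
message wording and surrounding control flow may differ):

	case NVMF_OPT_QUEUE_SIZE:
		if (match_int(args, &token)) {
			ret = -EINVAL;
			goto out;
		}
		/* Out-of-range values are rejected outright rather than
		 * clamped, so opts->queue_size can never exceed 1024.
		 */
		if (token < NVMF_MIN_QUEUE_SIZE ||
		    token > NVMF_MAX_QUEUE_SIZE) {
			pr_err("Invalid queue_size %d\n", token);
			ret = -EINVAL;
			goto out;
		}
		opts->queue_size = token;
		break;

So with this patch the effective per-queue depth works out to
min(opts->queue_size, max_qp_wr / (NVME_RDMA_SEND_WR_FACTOR + 1)),
with opts->queue_size already capped at 1024 by the parsing above.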



