[RFC PATCH V2 2/2] nvme: rdma: use ib_device's max_qp_wr to limit sqsize

Sagi Grimberg sagi at grimberg.me
Wed Dec 20 11:27:46 PST 2023


>>> @@ -1030,11 +1030,13 @@ static int nvme_rdma_setup_ctrl(struct 
>>> nvme_rdma_ctrl *ctrl, bool new)
>>>               ctrl->ctrl.opts->queue_size, ctrl->ctrl.sqsize + 1);
>>>       }
>>> -    if (ctrl->ctrl.sqsize + 1 > NVME_RDMA_MAX_QUEUE_SIZE) {
>>> +    ib_max_qsize = ctrl->device->dev->attrs.max_qp_wr /
>>> +            (NVME_RDMA_SEND_WR_FACTOR + 1);
>>
>> rdma_dev_max_qsize is a better name.
>>
>> Also, you can drop the RFC for the next submission.
>>
> 
> Sagi,
> I don't feel comfortable with these patches.

Well, good that you're speaking up then ;)

> First I would like to understand the need for it.

I assumed that he stumbled on a device that did not support the
existing max of 128 nvme commands (which is 384 rdma wrs for the qp).

> Second, the QP WR can be constructed from one or more WQEs and the WQEs 
> can be constructed from one or more WQEBBs. The max_qp_wr doesn't take 
> it into account.

Well, it is not taken into account now either with the existing magic
limit in nvmet. The rdma limits reporting mechanism was and still is
unusable.

I would expect a device that has different size for different work
items to report max_qp_wr accounting for the largest work element that
the device supports, so it is universally correct.

The fact that max_qp_wr means the maximum number of slots is a qp and
at the same time different work requests can arbitrarily use any number
of slots without anyone ever knowing, makes it pretty much impossible to
use reliably.

Maybe rdma device attributes need a new attribute called
universal_max_qp_wr that is going to actually be reliable and not
guess-work?



More information about the Linux-nvme mailing list