[PATCH 2/2] nvmet-rdma: implement get_queue_size controller op
Max Gurtovoy
mgurtovoy at nvidia.com
Wed Sep 22 02:35:45 PDT 2021
On 9/22/2021 12:18 PM, Sagi Grimberg wrote:
>
>>>> So for now, as mentioned, until we have some ib_ API, let's set it
>>>> to 128.
>>> Please just add the proper ib_ API, it should not be a whole lot of
>>> work as we already do that calculation anyway for the R/W API setup.
>>
>> We don't do this exact calculation, since only the low-level driver
>> knows the number of WQEs we need for some sophisticated WR.
>>
>> The API we need is something like ib_get_qp_limits, where the caller
>> provides input describing the operations it will issue and receives
>> the corresponding limits as output.
>>
>> Then we need to divide that by a factor reflecting the maximum number
>> of WRs per NVMe request (e.g. mem_reg + mem_invalidation + rdma_op
>> + pi_yes_no).
>>
>> I spoke with Jason on that and we decided that it's not a trivial patch.
>
> Can't you do this in rdma_rw? All of its users will need the
> exact same value, right?
The factor of operations per I/O request should be added to the RW API.
The factor of WRs to WQEs lives in the low-level driver and needs an ib_ API.
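To make that split concrete, here is a minimal sketch of how the two
factors could combine, assuming a hypothetical ib_get_qp_limits() and an
illustrative per-request WR count (none of these names exist in the tree
today):

#include <rdma/ib_verbs.h>

/* Hypothetical output of the missing ib_ API. */
struct ib_qp_limits {
	u32 max_send_wr;	/* send WRs the device can really absorb */
};

/*
 * Assumed prototype; the low-level driver would translate WRs to WQE
 * building blocks internally when filling this in.
 */
int ib_get_qp_limits(struct ib_device *dev, struct ib_qp_limits *limits);

static u32 nvmet_rdma_calc_queue_size(struct ib_device *dev, bool pi_enable)
{
	struct ib_qp_limits limits;
	u32 wrs_per_req;

	if (ib_get_qp_limits(dev, &limits))
		return 0;

	/* mem_reg + mem_invalidation + rdma_op (+ PI registration) */
	wrs_per_req = pi_enable ? 4 : 3;

	return limits.max_send_wr / wrs_per_req;
}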
>
>> Is it necessary for this submission, or can we live with a depth of
>> 128 for now? With or without a new ib_ API, the queue depth will be
>> in this range.
>
> I am not sure I see the entire complexity. Even if this calc is not
> accurate, you are already proposing to hard-code it to 128, so you
> can do this to account for the boundaries there.
How does the ULP know how many BBs (WQE building blocks) a maximal WR
operation consumes?
I prepared a patch to handle the case where we advertise support for X
but actually support less than X.
A value of 128 is supported by mlx devices, and I assume by other RDMA
devices as well, since it is the default value for the initiator.
The full solution includes changes in RDMA_RW, the ib_ core, and the
low-level drivers to implement the ib_ API.
I wanted to split this into an early solution (this series) and the full
solution (the above).
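For the early solution, the controller op itself can simply report a
fixed depth. A rough sketch, assuming the op name from the patch subject
and a 128 constant (the exact signature and constant name are assumptions
here):

/* Assumed constant for the interim fixed depth discussed above. */
#define NVMET_RDMA_MAX_QUEUE_SIZE	128

static u16 nvmet_rdma_get_queue_size(const struct nvmet_ctrl *ctrl)
{
	/*
	 * Until an ib_ API exposes the real WQE budget, report a depth
	 * that mlx devices (and presumably others) can honor.
	 */
	return NVMET_RDMA_MAX_QUEUE_SIZE;
}

static const struct nvmet_fabrics_ops nvmet_rdma_ops = {
	/* ... existing ops elided ... */
	.get_queue_size		= nvmet_rdma_get_queue_size,
};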