[PATCH 2/2] nvmet-rdma: implement get_queue_size controller op

Wed Sep 22 06:31:19 PDT 2021

On Wed, Sep 22, 2021 at 03:57:17PM +0300, Max Gurtovoy wrote:
> 
> On 9/22/2021 3:10 PM, Jason Gunthorpe wrote:
> > On Wed, Sep 22, 2021 at 12:18:15PM +0300, Sagi Grimberg wrote:
> > 
> > > Can't you do this in rdma_rw? All of the users of it will need the
> > > exact same value, right?
> > No, it depends on what ops the user is going to use.
> > > > Is it necessary for this submission, or can we live with 128 depth for
> > > > now? With and without the new ib_ API the queue depth will be in these
> > > > sizes.
> > > I am not sure I see the entire complexity. Even if this calc is not
> > > accurate, you are already proposing to hard-code it to 128, so you
> > > can do this to account for the boundaries there.
> > As I understood it the 128 is to match what the initiator hardcodes
> > its limit to - both sides have the same basic problem with allocating
> > the RDMA QP, they just had different hard coded limits. Due to this we
> > know that 128 is OK for all RDMA HW as the initiator has proven it
> > already.
> 
> Not exactly. The initiator 128 is the default value if not set differently
> in the connect command.
> 
> Probably this value can be bigger on the initiator, since it doesn't perform
> RDMA operations but only sends descriptors to the target.

Well, that means the initiator side needs fixing too. I see this:

			if (token < NVMF_MIN_QUEUE_SIZE ||
			    token > NVMF_MAX_QUEUE_SIZE) {
				/* error path elided */
			}
			opts->queue_size = token;

Which is probably still too big for what some HW can do.

Both host and target need to bring in an upper limit of queue_size
from the RDMA layer. A ULP should not pass in a value to
ib_qp_init_attr::max_send_wr that will cause QP creation to fail if
the queue_size is programmable.
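
To make the idea concrete, here is a minimal userspace sketch (not kernel
code) of bounding a requested queue_size by a limit handed up from the RDMA
layer. The helper name pick_queue_size, the rdma_limit parameter and the
SKETCH_MIN_QUEUE_SIZE value are all hypothetical and only for illustration;
no such RDMA core interface exists today.

```c
/* Assumed lower bound, standing in for NVMF_MIN_QUEUE_SIZE. */
#define SKETCH_MIN_QUEUE_SIZE 16

/*
 * Hypothetical helper: pick the queue size a ULP would actually use.
 * 'requested' is the user-programmable value, 'rdma_limit' is an upper
 * bound assumed to come from the RDMA layer for this device.
 * Returns -1 if the request or the device limit is below the minimum,
 * otherwise the requested size capped at rdma_limit.
 */
static int pick_queue_size(int requested, int rdma_limit)
{
	if (requested < SKETCH_MIN_QUEUE_SIZE ||
	    rdma_limit < SKETCH_MIN_QUEUE_SIZE)
		return -1;

	return requested < rdma_limit ? requested : rdma_limit;
}
```

With this shape a request for 1024 entries on a device that can only do 128
is quietly reduced to 128 instead of failing later at QP creation time.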

Currently there is no way to get the device limit for QPs using
IB_QP_CREATE_INTEGRITY_EN. We know at least that 128 works on all RDMA
devices.

In any case I still view it as two tasks: first, fix the various interop
problems by adjusting the current hardwired limits to something that
works on all RDMA HW; second, compute the actual HW limit, adjusted by
RW, etc.
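
For the second task, a rough sketch of what "computing the actual HW limit"
could look like, as plain userspace C. The budget model here is an
assumption: each queue entry is taken to cost one send WR plus
wrs_per_rw_ctx WRs for its RDMA read/write context, with one extra WR kept
in reserve. The function name and both parameters are hypothetical; the real
accounting would have to come from the RW code and the device attributes.

```c
/*
 * Hypothetical helper: largest queue size whose send work requests fit
 * in the device's per-QP WR budget.
 *
 * Assumed model: queue_size * (1 + wrs_per_rw_ctx) + 1 <= max_qp_wr
 * Returns 0 when the inputs leave no room for any entry.
 */
static int hw_queue_size_limit(int max_qp_wr, int wrs_per_rw_ctx)
{
	if (max_qp_wr < 1 || wrs_per_rw_ctx < 0)
		return 0;

	/* Integer division rounds down, so the budget is never exceeded. */
	return (max_qp_wr - 1) / (1 + wrs_per_rw_ctx);
}
```

Under this model a device with 8192 WRs per QP and 3 RW WRs per entry would
allow 2047 entries, since 2047 * 4 + 1 = 8189 fits and 2048 * 4 + 1 = 8193
does not.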

Jason