can not load more than 51 nvme rdma device
李春
pickup112 at gmail.com
Thu May 17 00:50:31 PDT 2018
Another question:
How can we find out how many MRs are left in the InfiniBand CA?
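For reference, a minimal libibverbs sketch (my own, untested here) that prints
max_mr for each verbs device would look roughly like the following; as far as I
can tell, ibv_query_device() only reports the maximum, not how many MRs are
currently in use:

/* Sketch: print max_mr (and max_qp) reported by each RDMA device.
 * Build with: gcc query_max_mr.c -o query_max_mr -libverbs
 */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num;
    struct ibv_device **list = ibv_get_device_list(&num);

    if (!list || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    for (int i = 0; i < num; i++) {
        struct ibv_context *ctx = ibv_open_device(list[i]);
        struct ibv_device_attr attr;

        if (!ctx)
            continue;
        if (ibv_query_device(ctx, &attr) == 0) {
            /* max_mr is reported per verbs device context. */
            printf("%s: max_mr=%d max_qp=%d\n",
                   ibv_get_device_name(list[i]), attr.max_mr, attr.max_qp);
        }
        ibv_close_device(ctx);
    }

    ibv_free_device_list(list);
    return 0;
}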
李春 <pickup112 at gmail.com> wrote on Thursday, May 17, 2018 at 3:15 PM:
> Thanks for your help.
> > Any reason for this type of 1:1 configuration ?
> > Can you expose all 100 disks using 1 subsystem or 10 disks per
> > subsystem ?
> > Do you understand the difference in resource allocation between both
> > cases ?
> Our actual production environment will use IB switches for the interconnect
> in a cluster, and in that environment we may also have hundreds of disks
> exported to the same node.
> After we found this problem with io_queue, we reproduced it in a test
> environment with two directly connected nodes.
> We understand the difference you mentioned, but in the actual production
> environment we do have more than 51 targets to load, so we want to understand
> thoroughly why this restriction occurs and whether there is a way to work
> around the limitation.
> > Try to use --queue-size=16 in your connect command.
> > You don't really need so many resources (10 io queues with 128 queue-size
> > each) to saturate a 56Gb wire.
> How can we tune nr-io-queues or queue-size? Is there a best practice for our
> scenario?
> What is the impact of increasing or decreasing queue-size/nr-io-queues?
> So if nr_io_queues * queue_size * nvme_connect_number > max_mr, we will hit
> the error.
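Just to make that arithmetic concrete, a rough sketch (the max_mr value below
is made up purely for illustration; the real value would come from
ibv_query_device()):

/* Back-of-the-envelope check, assuming one MR per queue entry of every
 * IO queue, i.e. nr_io_queues * queue_size MRs per connected controller.
 */
#include <stdio.h>

int main(void)
{
    int max_mr       = 65536; /* made-up example value */
    int nr_io_queues = 10;
    int queue_size   = 128;

    int mrs_per_ctrl = nr_io_queues * queue_size; /* 1280 */
    int max_ctrls    = max_mr / mrs_per_ctrl;     /* ~51 with these numbers */

    printf("MRs per controller: %d\n", mrs_per_ctrl);
    printf("controllers before hitting max_mr: about %d\n", max_ctrls);
    return 0;
}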
> > This is because you try to allocate more MRs than the maximum supported
> > by the device.
> > In NVMe/RDMA we create "queue-size" number of MRs per each created IO
> > queue.
> How can we know the maximum number of MRs supported by the device?
> Do you mean that MR refers to a Memory Region?
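(My understanding, sketched below with a hypothetical helper, is that an MR is
a buffer registered with the HCA via ibv_reg_mr(), and each registration
consumes one of the device's MR slots until it is deregistered with
ibv_dereg_mr():)

/* Sketch of my understanding: a Memory Region is a buffer registered
 * with the HCA so it can be used for RDMA transfers. Each successful
 * ibv_reg_mr() consumes one MR slot on the device.
 */
#include <stdlib.h>
#include <infiniband/verbs.h>

struct ibv_mr *register_one_buffer(struct ibv_pd *pd, size_t len)
{
    void *buf = malloc(len);

    if (!buf)
        return NULL;

    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ);
    if (!mr)
        free(buf); /* registration failed, e.g. device MR limit reached */
    return mr;
}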
> > The max_mr for this adapter is much bigger.
> > If the above solutions are not enough, then we can dig in more into the
> > low-level drivers...
> According to what you said above, max_mr is just a hardware attribute, not
> related to Linux, nvme or rdma, isn't it?
> Is max_mr an attribute of the network card as a whole, or of a single port on
> the card? According to my test here, after one port of a dual-port network
> card reports the error, the other port can continue to load new targets.
> Can you suggest any information or documentation that describes the
> relationship between queue-size, nr_io_queues and MRs?
> --
> pickup.lichun 李春
--
pickup.lichun 李春