can not load more than 51 nvme rdma device
李春
pickup112 at gmail.com
Thu May 17 00:50:31 PDT 2018
Another question:
How can we find out how many MRs are left in the InfiniBand CA?
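For reference, a minimal libibverbs sketch (my own, untested here) that prints
max_mr for each verbs device would look roughly like the following; as far as I
can tell, ibv_query_device() only reports the maximum, not how many MRs are
currently in use:

/* Sketch: print max_mr (and max_qp) reported by each RDMA device.
 * Build with: gcc query_max_mr.c -o query_max_mr -libverbs
 */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num;
    struct ibv_device **list = ibv_get_device_list(&num);

    if (!list || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    for (int i = 0; i < num; i++) {
        struct ibv_context *ctx = ibv_open_device(list[i]);
        struct ibv_device_attr attr;

        if (!ctx)
            continue;
        if (ibv_query_device(ctx, &attr) == 0) {
            /* max_mr is reported per verbs device context. */
            printf("%s: max_mr=%d max_qp=%d\n",
                   ibv_get_device_name(list[i]), attr.max_mr, attr.max_qp);
        }
        ibv_close_device(ctx);
    }

    ibv_free_device_list(list);
    return 0;
}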
李春 <pickup112 at gmail.com> wrote on Thursday, May 17, 2018 at 3:15 PM:
> Thanks for your help.
> > Any reason for this type of 1:1 configuration ?
> > Can you expose all 100 disks using 1 subsystem or 10 disks per
> > subsystem ?
> > Do you understand the difference in resource allocation between both
> > cases ?
> Our actual production environment will use IB switches for the interconnect
> in a cluster, and in that environment we may also have hundreds of disks
> exported to the same node.
> After we found this problem with io_queue, we reproduced it in a test
> environment with two directly connected nodes.
> We understand the difference you mentioned, but in the actual production
> environment we do have more than 51 targets to load, so we want to understand
> thoroughly why this restriction occurs and whether there is a way to work
> around the limitation.
> > Try to use --queue-size=16 in your connect command.
> > You don't really need so many resources (10 io queues with 128 queue-size
> > each) to saturate a 56Gb wire.
> How can we tune nr-io-queues or queue-size? Is there a best practice for our
> scenario?
> What is the impact of increasing or decreasing queue-size/nr-io-queues?
> So if nr_io_queues * queue_size * nvme_connect_number > max_mr, we will hit
> the error.
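Just to make that arithmetic concrete, a rough sketch (the max_mr value below
is made up purely for illustration; the real value would come from
ibv_query_device()):

/* Back-of-the-envelope check, assuming one MR per queue entry of every
 * IO queue, i.e. nr_io_queues * queue_size MRs per connected controller.
 */
#include <stdio.h>

int main(void)
{
    int max_mr       = 65536; /* made-up example value */
    int nr_io_queues = 10;
    int queue_size   = 128;

    int mrs_per_ctrl = nr_io_queues * queue_size; /* 1280 */
    int max_ctrls    = max_mr / mrs_per_ctrl;     /* ~51 with these numbers */

    printf("MRs per controller: %d\n", mrs_per_ctrl);
    printf("controllers before hitting max_mr: about %d\n", max_ctrls);
    return 0;
}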
> > This is because you try to allocate more MRs than the maximum supported
> > by the device.
> > In NVMe/RDMA we create "queue-size" number of MRs per each created IO
> > queue.
> How can we know the maximum number of MRs supported by the device?
> Do you mean that MR refers to a Memory Region?
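(My understanding, sketched below with a hypothetical helper, is that an MR is
a buffer registered with the HCA via ibv_reg_mr(), and each registration
consumes one of the device's MR slots until it is deregistered with
ibv_dereg_mr():)

/* Sketch of my understanding: a Memory Region is a buffer registered
 * with the HCA so it can be used for RDMA transfers. Each successful
 * ibv_reg_mr() consumes one MR slot on the device.
 */
#include <stdlib.h>
#include <infiniband/verbs.h>

struct ibv_mr *register_one_buffer(struct ibv_pd *pd, size_t len)
{
    void *buf = malloc(len);

    if (!buf)
        return NULL;

    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ);
    if (!mr)
        free(buf); /* registration failed, e.g. device MR limit reached */
    return mr;
}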
> > The max_mr for this adapter is much bigger.
> > If the above solutions are not enough, then we can dig in more into the
> > low-level drivers...
> According to what you said above, max_mr is just a hardware attribute, not
> related to Linux, nvme or rdma, isn't it?
> Is max_mr an attribute of the network card as a whole, or of a single port on
> the card? According to my test here, after one port of a dual-port network
> card reports the error, the other port can continue to load new targets.
> Can you suggest any information or documentation that describes the
> relationship between queue-size, nr_io_queues and MRs?
> --
> pickup.lichun 李春
--
pickup.lichun 李春