[PATCH 2/3] nvmet-rdma: use SRQ per completion vector
Max Gurtovoy
maxg at mellanox.com
Thu Sep 7 03:47:08 PDT 2017
On 9/6/2017 5:50 PM, Sagi Grimberg wrote:
>
>> In order to save resource allocation and utilize the completion
>> locality in a better way, allocate Shared Receive Queues (SRQs) per
>> completion vector (and not per device).
>
> Something is backwards here, srq per vector is not saving resources
> compared to srq per-device, maybe compared to normal receive queues
> (if we have enough initiators).
I meant compared to the normal MQ case (per-queue receive buffers).
I also added a module param so one can control the amount of resources
allocated.
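For illustration only, a rough sketch of what such a knob could look like
(the parameter name, type and default here are mine, not necessarily what
the patch uses):

  #include <linux/module.h>

  /* Hypothetical knob: how many SRQs to allocate per device.
   * 0 = one SRQ per completion vector, as described above. */
  static unsigned int srq_count;
  module_param(srq_count, uint, 0444);
  MODULE_PARM_DESC(srq_count,
          "Number of SRQs per device (0 = one per completion vector)");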
>
> And what do you mean by "utilize the completion locality in a better
> way"? How is using srq affecting a completion queue locality?
I meant that for each completion that arrives on CPU N, the "local" SRQ_N
is the one on which we post_recv the buffer (and take the SRQ spinlock).
With a single SRQ per device we get completions from all CPUs, and that one
SRQ is "working" on all of them.
Maybe we can rephrase it.
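Something along these lines (a simplified sketch on top of the verbs API;
error handling, sizes and the surrounding nvmet_rdma structures are assumed
here, not taken from the patch):

  /* Sketch: one SRQ per completion vector of the device; each queue's QP
   * is then hooked to the SRQ matching its index, so receive buffers are
   * posted and completed on the same vector (typically the same CPU). */
  static struct ib_srq **nvmet_rdma_alloc_srqs(struct ib_device *dev,
                                               struct ib_pd *pd, u32 srq_size)
  {
          int i, nr_vec = dev->num_comp_vectors;
          struct ib_srq **srqs;

          srqs = kcalloc(nr_vec, sizeof(*srqs), GFP_KERNEL);
          if (!srqs)
                  return NULL;

          for (i = 0; i < nr_vec; i++) {
                  struct ib_srq_init_attr attr = {
                          .attr.max_wr  = srq_size,  /* assumed SRQ depth */
                          .attr.max_sge = 1,
                  };

                  srqs[i] = ib_create_srq(pd, &attr);
                  /* error handling omitted */
          }
          return srqs;
  }

  /* ... and when a queue with index qidx creates its QP: */
  qp_attr.srq = srqs[qidx % nr_vec];
  ret = rdma_create_qp(cm_id, pd, &qp_attr);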
>
>> Assosiate each created QP/CQ
>
> associate
:)
>
>> with an appropriate SRQ according to the queue index. This association
>> will reduce the lock contention in the fast path
>
> It reduces lock contention compared to srq-per-device, not normal
> receive queues.
of course.
>
>
>> and increase the locality in memory buffers.
>
> How does it increase locality in memory buffers?
We are sharing buffers between many connections (compared to normal MQ).
Let's say many initiators are running traffic: the shared buffers will be
"hot" much more often than in the case where each buffer is used by a
single consumer. I assume the memory subsystem can recognize this
situation and benefit from it.
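To sketch the fast-path idea (simplified, not the actual patch code; the
srq back-pointer in nvmet_rdma_cmd is my assumption), the recv completion
handler re-posts the buffer to the same per-vector SRQ it came from:

  /* Sketch: on recv completion, return the buffer to the SRQ tied to this
   * completion vector, so the same buffer pool keeps cycling on the CPU
   * that handles this vector. */
  static void nvmet_rdma_recv_done(struct ib_cq *cq, struct ib_wc *wc)
  {
          struct nvmet_rdma_cmd *cmd =
                  container_of(wc->wr_cqe, struct nvmet_rdma_cmd, cqe);
          struct ib_recv_wr *bad_wr;

          /* ... handle the incoming command ... */

          /* cmd->srq: assumed back-pointer to the per-vector SRQ */
          ib_post_srq_recv(cmd->srq, &cmd->wr, &bad_wr);
  }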