Unexpected issues with 2 NVME initiators using the same target

Chuck Lever chuck.lever at oracle.com
Mon Jul 10 13:51:20 PDT 2017


> On Jul 10, 2017, at 4:05 PM, Jason Gunthorpe <jgunthorpe at obsidianresearch.com> wrote:
> 
> On Mon, Jul 10, 2017 at 03:03:18PM -0400, Chuck Lever wrote:
> 
>> One option is to somehow split the Send-related data structures from
>> rpcrdma_req, and manage them independently. I've already done that for
>> MRs: MR state is now located in rpcrdma_mw.
> 
> Yes, this is what I was implying: track the SQE-related stuff
> separately in memory allocated during SQ setup - MRs, DMA maps, etc.
> 
> No need for an atomic/lock then, right? The required memory is bounded
> since the inline send depth is bounded.

Perhaps I lack some imagination, but I don't see how I can manage
these small objects without a serialized free list or circular
array that would be accessed in the forward path and also in a
Send completion handler.
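
Roughly the shape I have in mind (illustrative only -- rpcrdma_sendctx,
rpcrdma_sendctx_ring, and the get/put helpers below are invented names,
not actual xprtrdma code): a circular array of Send contexts allocated
at SQ setup, with a lock taken in both the forward path and the Send
completion handler.

#include <linux/spinlock.h>
#include <rdma/ib_verbs.h>

/* Illustrative sketch, not the real xprtrdma structures. */
#define SENDCTX_MAX_SGES	16

struct rpcrdma_sendctx {
	struct ib_cqe		sc_cqe;		/* Send completion */
	unsigned int		sc_unmap_count;	/* pages to unmap */
	struct ib_sge		sc_sges[SENDCTX_MAX_SGES];
};

struct rpcrdma_sendctx_ring {
	spinlock_t		 sr_lock;	/* serializes both paths;
						 * locking context simplified */
	unsigned long		 sr_head;	/* next free slot */
	unsigned long		 sr_tail;	/* oldest in-flight slot */
	unsigned long		 sr_size;	/* power of two, == SQ depth */
	struct rpcrdma_sendctx	*sr_ctxs[];	/* allocated at SQ setup */
};

/* Forward path: grab the next free Send context, or NULL if every
 * slot is still owned by a running Send.
 */
static struct rpcrdma_sendctx *
rpcrdma_sendctx_get(struct rpcrdma_sendctx_ring *ring)
{
	struct rpcrdma_sendctx *sc = NULL;

	spin_lock(&ring->sr_lock);
	if (ring->sr_head - ring->sr_tail < ring->sr_size)
		sc = ring->sr_ctxs[ring->sr_head++ & (ring->sr_size - 1)];
	spin_unlock(&ring->sr_lock);
	return sc;
}

/* Send completion handler: Sends complete in order on a QP, so retire
 * the oldest in-flight slot.
 */
static void
rpcrdma_sendctx_put(struct rpcrdma_sendctx_ring *ring)
{
	spin_lock(&ring->sr_lock);
	ring->sr_tail++;
	spin_unlock(&ring->sr_lock);
}

The lock could be replaced by some lockless single-producer scheme, but
either way a slot cannot be reused until a Send completion retires it.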

And I would still have to signal all Sends, which means extra
interrupts and context switches.

This seems like a lot of overhead to deal with a very uncommon
case. I can reduce this overhead by signaling only Sends that
need to unmap page cache pages, but still.
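
The reduction I mean would look something like this (again illustrative,
building on the invented rpcrdma_sendctx above; ib_send_wr, IB_WR_SEND,
and IB_SEND_SIGNALED are the standard verbs interfaces):

static void
rpcrdma_prepare_send_wr(struct rpcrdma_sendctx *sc, struct ib_send_wr *wr,
			int num_sge)
{
	wr->next	= NULL;
	wr->wr_cqe	= &sc->sc_cqe;
	wr->sg_list	= sc->sc_sges;
	wr->num_sge	= num_sge;
	wr->opcode	= IB_WR_SEND;
	wr->send_flags	= 0;

	/* Ask for a completion only when this Send mapped page cache
	 * pages that must be DMA-unmapped after the HCA is done with
	 * them; every other Send stays unsignaled.
	 */
	if (sc->sc_unmap_count)
		wr->send_flags |= IB_SEND_SIGNALED;
}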


But I also realized that Send Queue accounting can be broken by a
delayed Send completion.

As we previously discussed, xprtrdma does SQ accounting using RPC
completion as the gate. Basically xprtrdma will send another RPC
as soon as a previous one is terminated. If the Send WR is still
running when the RPC terminates, I can potentially overrun the
Send Queue.
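
Closing that hole would mean gating on Send completion rather than RPC
termination. A minimal sketch of that kind of accounting, assuming a
per-transport counter (sq_avail is an invented name) initialized to the
SQ depth at QP creation:

#include <linux/atomic.h>

/* Illustrative: in practice this would live in the transport's data
 * structures and be set to the Send Queue depth at setup.
 */
static atomic_t sq_avail = ATOMIC_INIT(0);

/* Forward path: reserve an SQE before posting.  If none is free, the
 * caller must defer the Send rather than overrun the Send Queue.
 */
static bool rpcrdma_sq_reserve(void)
{
	return atomic_dec_if_positive(&sq_avail) >= 0;
}

/* Send completion handler: the SQE is not free until the HCA reports
 * the Send complete, which can be long after the RPC terminates.
 */
static void rpcrdma_sq_release(void)
{
	atomic_inc(&sq_avail);
}

Of course that only works if the completion handler actually runs for
the Sends that release credits, which circles back to the signaling
overhead above.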


--
Chuck Lever





