Unexpected issues with 2 NVME initiators using the same target

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Mon Jul 10 14:32:51 PDT 2017


On Mon, Jul 10, 2017 at 05:29:53PM -0400, Chuck Lever wrote:
> 
> > On Jul 10, 2017, at 5:24 PM, Jason Gunthorpe <jgunthorpe at obsidianresearch.com> wrote:
> > 
> > On Mon, Jul 10, 2017 at 03:03:18PM -0400, Chuck Lever wrote:
> > 
> >>>> Or I could revert all the "map page cache pages" logic and
> >>>> just use memcpy for small NFS WRITEs, and RDMA the rest of
> >>>> the time. That keeps everything simple, but means large
> >>>> inline thresholds can't use send-in-place.
> >>> 
> >>> Don't you have the same problem with RDMA WRITE?
> >> 
> >> The server side initiates RDMA Writes. The final RDMA Write in a WR
> >> chain is signaled, but a subsequent Send completion is used to
> >> determine when the server may release resources used for the Writes.
> >> We're already doing it the slow way there, and there's no ^C hazard
> >> on the server.
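
To make that sequencing concrete, here is a rough kernel-verbs sketch (not the
actual svcrdma code; the example_* names and the per-reply context are invented
for illustration). Only the last Write carries IB_SEND_SIGNALED, and the
resources backing the Writes are released only from the completion handler of
the Send that follows them:

#include <rdma/ib_verbs.h>

/* Hypothetical per-reply state; the real code keeps more than this. */
struct example_reply_ctx {
	struct ib_cqe	send_cqe;	/* completion for the final Send */
	/* ... DMA mappings, pages, etc. ... */
};

/* Runs only once the HCA has completed the Send that follows the
 * Writes; this is the point where the Write resources may be released. */
static void example_reply_done(struct ib_cq *cq, struct ib_wc *wc)
{
	struct example_reply_ctx *ctx =
		container_of(wc->wr_cqe, struct example_reply_ctx, send_cqe);

	/* DMA unmap / page release for the Write payload goes here. */
	(void)ctx;
}

/* 'writes' already carry sg_list/remote_addr/rkey set up by the caller;
 * nr_writes is assumed to be at least 1. */
static int example_post_writes_then_send(struct ib_qp *qp,
					 struct ib_rdma_wr *writes,
					 int nr_writes,
					 struct ib_sge *reply_sge,
					 struct example_reply_ctx *ctx)
{
	struct ib_send_wr send = { .opcode = IB_WR_SEND };
	struct ib_send_wr *bad_wr;
	int i;

	for (i = 0; i < nr_writes; i++) {
		writes[i].wr.opcode = IB_WR_RDMA_WRITE;
		writes[i].wr.next = (i == nr_writes - 1) ?
					&send : &writes[i + 1].wr;
		/* Only the final Write in the chain is signaled. */
		writes[i].wr.send_flags =
			(i == nr_writes - 1) ? IB_SEND_SIGNALED : 0;
	}

	ctx->send_cqe.done = example_reply_done;
	send.wr_cqe = &ctx->send_cqe;
	send.sg_list = reply_sge;
	send.num_sge = 1;
	send.send_flags = IB_SEND_SIGNALED;

	return ib_post_send(qp, &writes[0].wr, &bad_wr);
}
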
> > 
> > Wait, I guess I meant RDMA READ path.
> > 
> > The same constraints apply to RKeys as to inline send - you cannot DMA
> > unmap rkey memory until the rkey is invalidated at the HCA.
> > 
> > So posting an invalidate SQE and then immediately unmapping the DMA
> > pages is bad too..
> > 
> > No matter how the data is transferred, the unmapping must follow the
> > same HCA-synchronous model: DMA unmap must only be done from the send
> > completion handler (inline send or invalidate rkey), from the recv
> > completion handler (send with invalidate), or from QP error state teardown.
> > 
> > Anything that does DMA memory unmap from another thread is very, very
> > suspect, e.g. asynchronously from a ctrl-c trigger event.
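
As a concrete illustration of that rule (a minimal sketch against the kernel
verbs API; the example_* names are invented and error handling is omitted):
the DMA unmap for rkey-backed memory lives in the Local Invalidate's
completion handler, never right after ib_post_send():

#include <rdma/ib_verbs.h>

/* Hypothetical per-MR context carrying what the unmap needs. */
struct example_mr_ctx {
	struct ib_cqe		inv_cqe;
	struct ib_device	*device;
	struct scatterlist	*sgl;
	int			nents;
	enum dma_data_direction	dir;
};

static void example_inv_done(struct ib_cq *cq, struct ib_wc *wc)
{
	struct example_mr_ctx *ctx =
		container_of(wc->wr_cqe, struct example_mr_ctx, inv_cqe);

	/* Only here, once the HCA reports the invalidate complete,
	 * is it safe to tear down the DMA mapping. */
	ib_dma_unmap_sg(ctx->device, ctx->sgl, ctx->nents, ctx->dir);
}

static int example_post_local_inv(struct ib_qp *qp, u32 rkey,
				  struct example_mr_ctx *ctx)
{
	struct ib_send_wr inv = { .opcode = IB_WR_LOCAL_INV };
	struct ib_send_wr *bad_wr;

	ctx->inv_cqe.done = example_inv_done;

	inv.ex.invalidate_rkey = rkey;
	inv.send_flags = IB_SEND_SIGNALED;
	inv.wr_cqe = &ctx->inv_cqe;

	/* Posting alone is not enough: do NOT ib_dma_unmap_sg() here. */
	return ib_post_send(qp, &inv, &bad_wr);
}
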
> 
> The 4.13 server side is converted to use the rdma_rw API for
> handling RDMA Read. For non-iWARP cases, it's using the
> local DMA key for Read sink buffers. For iWARP it should
> be using Read-with-invalidate (IIRC).
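
For reference, roughly how that looks with the rdma_rw API (a simplified
sketch, not the actual svcrdma code; the example_* names are invented).
rdma_rw_ctx_init() decides internally whether the local DMA lkey suffices or
per-context MRs must be registered, e.g. on iWARP:

#include <linux/scatterlist.h>
#include <rdma/ib_verbs.h>
#include <rdma/rw.h>

/* Hypothetical per-request state for one RDMA Read from the client. */
struct example_read_req {
	struct rdma_rw_ctx	rw_ctx;
	struct ib_cqe		read_cqe;
	struct scatterlist	*sgl;	/* sink pages; mapped by rdma_rw_ctx_init() */
	u32			sg_cnt;
};

static void example_read_done(struct ib_cq *cq, struct ib_wc *wc)
{
	/* The real code would call rdma_rw_ctx_destroy() here, after the
	 * completion, and only then hand the data to the upper layer. */
}

static int example_post_rdma_read(struct ib_qp *qp, u8 port_num,
				  struct example_read_req *req,
				  u64 remote_addr, u32 rkey)
{
	int ret;

	/* DMA_FROM_DEVICE: the local scatterlist is the Read sink.
	 * rdma_rw picks local-DMA-lkey vs. MR registration internally. */
	ret = rdma_rw_ctx_init(&req->rw_ctx, qp, port_num,
			       req->sgl, req->sg_cnt, 0,
			       remote_addr, rkey, DMA_FROM_DEVICE);
	if (ret < 0)
		return ret;

	req->read_cqe.done = example_read_done;
	return rdma_rw_ctx_post(&req->rw_ctx, qp, port_num,
				&req->read_cqe, NULL);
}
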

The server sounds fine; how does the client work?

Jason


