Unexpected issues with 2 NVME initiators using the same target

Sagi Grimberg sagi at grimberg.me
Tue Jun 20 10:12:50 PDT 2017


> So on occasion there is a Remote Access Error. That would
> trigger connection loss, and the retransmitted Send request
> is discarded (if there was externally exposed memory involved
> with the original transaction that is now invalid).

I'm actually not concerned about the remote invalidation; it's
good that it's discarded or failed. It's the inline sends that
are the bug here.

> But the real problem is preventing retransmitted Sends from
> causing a ULP request to be executed multiple times.

Exactly.

>> Signalling all send completions and also finishing I/Os only after we
>> got them will add latency, and that sucks...
> 
> Typically, Sends will complete before the response arrives.
> The additional cost will be handling extra interrupts, IMO.

Not quite, heavy traffic _can_ result in dropped acks, and my gut
feeling is that it happens more often than we suspect.

And yeah, extra interrupts, extra cachelines, extra state,
but I do not see any other way around it.
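
As a rough sketch of what "finish the I/O only after both the send
completion and the response have arrived" could look like: the request
carries a two-way reference count, and whichever event arrives last
finishes it. This is plain userspace C with hypothetical names
(sketch_req, sketch_send_done, sketch_recv_done), not the actual
nvme-rdma code, and it assumes every Send is signalled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request state; not the real nvme-rdma structure. */
struct sketch_req {
	atomic_int ref;	/* 2 = waiting for send completion and response */
	bool done;	/* set once the request has been finished */
};

static void sketch_req_init(struct sketch_req *req)
{
	atomic_init(&req->ref, 2);
	req->done = false;
}

/* Drop one reference; only the last caller finishes the request. */
static bool sketch_req_put(struct sketch_req *req)
{
	int old = atomic_fetch_sub(&req->ref, 1);

	assert(old > 0);	/* more than two puts would be a bug */
	if (old == 1) {
		/* Both events seen: buffers may be released, I/O ended. */
		req->done = true;
		return true;
	}
	return false;
}

/* Would be called from the (signalled) send completion handler. */
static bool sketch_send_done(struct sketch_req *req)
{
	return sketch_req_put(req);
}

/* Would be called when the response for this request is received. */
static bool sketch_recv_done(struct sketch_req *req)
{
	return sketch_req_put(req);
}
```

If the response races ahead of the send completion, sketch_recv_done()
returns false and the request stays alive; the later sketch_send_done()
returns true and finishes it. The cost is exactly the extra state and
the extra completion per request mentioned above.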

> With FRWR, won't subsequent WRs be delayed until the HCA is
> done with the Send? I don't think a signal is necessary in
> every case. Send Queue accounting currently relies on that.

Not really. The Send after the FRWR might have a fence (not a strongly
ordered one), and CX3/CX4 strongly order FRWR, so for them that is a
non-issue. The problem is that ULPs can't rely on it.

> RPC-over-RDMA relies on the completion of Local Invalidation
> to ensure that the initial Send WR is complete.

Wait, is that guaranteed?

> For Remote
> Invalidation and pure inline, there is nothing to fence that
> Send.
> 
> The question I have is: how often do these Send retransmissions
> occur? Is it enough to have a robust recovery mechanism, or
> do we have to wire in assumptions about retransmission to
> every Send operation?

Even if it's rare, we don't have any way to protect against devices
retrying the send operation.



More information about the Linux-nvme mailing list