Unexpected issues with 2 NVME initiators using the same target

Max Gurtovoy maxg at mellanox.com
Tue Jun 20 03:31:48 PDT 2017



On 6/20/2017 12:33 PM, Sagi Grimberg wrote:
>
>>>> Here is the parsed output; it says that there was an access to an mkey
>>>> which is free.
>
> Missed that :)
>
>>>> ======== cqe_with_error ========
>>>> wqe_id                           : 0x0
>>>> srqn_usr_index                   : 0x0
>>>> byte_cnt                         : 0x0
>>>> hw_error_syndrome                : 0x93
>>>> hw_syndrome_type                 : 0x0
>>>> vendor_error_syndrome            : 0x52
>>>
>>> Can you share the check that correlates to the vendor+hw syndrome?
>>
>> mkey.free == 1
>
> Hmm, the way I understand it is that the HW is trying to access
> (locally via send) a MR which was already invalidated.
>
> Thinking of this further, this can happen in a case where the target
> already completed the transaction, sent SEND_WITH_INVALIDATE but the
> original send ack was lost somewhere causing the device to retransmit
> from the MR (which was already invalidated). This is highly unlikely
> though.
>
> Shouldn't this be protected somehow by the device?
> Can someone explain why the above cannot happen? Jason? Liran? Anyone?
>
> Say the host registers MR (a) and sends (1) from that MR to a target,
> the ack for send (1) gets lost, and the target issues SEND_WITH_INVALIDATE
> on MR (a) and the host HCA processes it. Then the host HCA times out on
> send (1) and retries, but the MR is already invalidated.

This might happen IMO.
Robert, can you try the patch below? It is untested and not the full
solution, just something to think about:

diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index c538a38..e93bd40 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -1079,7 +1079,7 @@ int iser_post_send(struct ib_conn *ib_conn, struct iser_tx_desc *tx_desc,
         wr->sg_list = tx_desc->tx_sg;
         wr->num_sge = tx_desc->num_sge;
         wr->opcode = IB_WR_SEND;
-       wr->send_flags = signal ? IB_SEND_SIGNALED : 0;
+       wr->send_flags = IB_SEND_SIGNALED;

         ib_ret = ib_post_send(ib_conn->qp, &tx_desc->wrs[0].send, &bad_wr);
         if (ib_ret)
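
To make the lost-ack sequence quoted above easier to reason about, here is a
minimal toy replay of it in plain C. This is only a sketch under assumptions:
every name in it (toy_mr, toy_send, toy_hca_transmit, toy_remote_invalidate,
toy_replay) is hypothetical and is not a kernel, verbs, or mlx5 structure. The
"free" flag merely stands in for the "mkey.free == 1" check mentioned earlier.

```c
#include <stdbool.h>

/* Toy model only: hypothetical types, not kernel or mlx5 structures. */
struct toy_mr {
	bool free;		/* stands in for the "mkey.free == 1" check */
};

struct toy_send {
	struct toy_mr *mr;	/* MR the send payload is read from */
	bool acked;		/* transport-level ack seen by the host HCA */
};

/* The HCA reads the payload from the MR; fails if the mkey is already free. */
static int toy_hca_transmit(const struct toy_send *s)
{
	return s->mr->free ? -1 : 0;
}

/* The target's SEND_WITH_INVALIDATE as processed by the host HCA. */
static void toy_remote_invalidate(struct toy_mr *mr)
{
	mr->free = true;
}

/* Returns 0 if no error, -1 if a (re)transmit touched a freed mkey. */
int toy_replay(bool ack_lost)
{
	struct toy_mr mr = { .free = false };
	struct toy_send send1 = { .mr = &mr, .acked = false };

	if (toy_hca_transmit(&send1))	/* first transmission of send (1) */
		return -1;
	send1.acked = !ack_lost;	/* the ack may be dropped on the wire */

	toy_remote_invalidate(&mr);	/* target completed the transaction */

	if (!send1.acked)		/* host HCA times out and retries */
		return toy_hca_transmit(&send1);
	return 0;
}
```

In this model toy_replay(false) returns 0, while toy_replay(true) returns -1
because the retry reads from an MR that the remote invalidate already freed.
It says nothing about whether the real device protects against this ordering.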

>
> Or, we can also have a race where we destroy all our MRs when I/O
> is still running (but from the code we should be safe here).
>
> Robert, when you rebooted the target, I assume the iSCSI ping
> timeout expired and the connection teardown started, correct?


