nvme tcp receive errors
Sagi Grimberg
sagi at grimberg.me
Tue May 4 19:15:28 BST 2021
>>>>>>> Hey Keith,
>>>>>>>
>>>>>>> Did this resolve the issues?
>>>>>>
>>>>>> We're unfortunately still observing data digest issues even with this.
>>>>>> Most of the testing has shifted to the r2t error, so I don't have any
>>>>>> additional details on the data digest problem.
>>>>>
>>>>> I've looked again at the code, and I'm not convinced that the patch
>>>>> is needed at all anymore, I'm now surprised that it actually changed
>>>>> anything (disregarding data digest).
>>>>>
The driver does not track the received bytes by definition; it relies
>>>>> on the controller to send it a completion, or set the success flag in
>>>>> the _last_ c2hdata pdu. Does your target set
>>>>> NVME_TCP_F_DATA_SUCCESS on any of the c2hdata pdus?
>>>>
>>>> Perhaps you can also run this patch instead?
>>>
>>> Thanks, will give this a shot.
>>
>> Still would be beneficial to look at the traces and check if
>> the success flag happens to be set. If this flag is set, the
>> driver _will_ complete the request without checking the bytes
>> received thus far (similar to how pci and rdma don't and can't
>> check dma byte count).
>
> I realized this patch is the same as one you'd sent earlier. We hit the
> BUG_ON(), and then proceeded to use your follow-up patch, which appeared
> to fix the data receive problem, but introduced data digest problems.
>
> So, are you saying that hitting this BUG_ON means that the driver has
> observed the completion out-of-order from the expected data?
If you hit the BUG_ON it means that the host spotted a c2hdata
PDU that has the success flag set before all the request data
was received:
--
@@ -759,6 +761,7 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 			queue->ddgst_remaining = NVME_TCP_DIGEST_LENGTH;
 		} else {
 			if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
+				BUG_ON(req->data_received != req->data_len);
 				nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
 				queue->nr_cqe++;
 			}
--
Which means that the host completes the request immediately, relying on
the controller having sent all the required data and knowing that a
completion response capsule will not be sent.
From the spec:
C2HData PDUs contain a LAST_PDU flag that is set to ‘1’ in the last PDU
of a command data transfer and is cleared to ‘0’ in all other C2HData
PDUs associated with the command. C2HData PDUs also contain a SUCCESS
flag that may be set to ‘1’ in the last C2HData PDU of a command data
transfer to indicate that the command has completed successfully. In
this case, no Response Capsule is sent by the controller for the command
and the host synthesizes a completion queue entry for the command with
the Command Specific field and the Status Field both cleared to 0h. If
the SUCCESS flag is cleared to ‘0’ in the last C2HData PDU of a command,
then the controller shall send a Response Capsule for the command to the
host. The SUCCESS flag shall be cleared to ‘0’ in all C2HData PDUs that
are not the last C2HData PDU for a command. The SUCCESS flag may be set
to ‘1’ in the last C2HData PDU only if the controller supports disabling
submission queue head pointer updates.
Hence my question: does the controller set NVME_TCP_F_DATA_SUCCESS in
any of the c2hdata PDUs that is not the last one? Does it set it in the
last one and omit the CQE response capsule, as expected by the host?