[PATCH 3/3] nvme-tcp: fix I/O stalls on congested sockets

Wed Apr 23 10:26:30 PDT 2025

Sagi, Hannes,

>>> Can you please share the output of gdb:
>>> l *nvme_tcp_recv_skb+0x115e
>> It points to the 3rd WARN_ON msg. I've replaced those warnings with the patch you suggested below.
>>
>> (gdb) l *(nvme_tcp_recv_skb+0x115e)
>> 0x628e is in nvme_tcp_recv_skb (tcp.c:782).
>> 777
>> 778             nvme_tcp_setup_h2c_data_pdu(req);
>> 779
>> 780             WARN_ON(queue->request == req);
>> 781             WARN_ON(llist_on_list(&req->lentry));
>> 782             WARN_ON(!list_empty(&req->entry));
>> 783             llist_add(&req->lentry, &queue->req_list);
>> 784             if (list_empty(&queue->send_list))
>> 785                     queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
>
>That is strange - should not happen afaict. You did not change the code
>and recompile before checking this, correct?
No, no other code changes were made before checking the above.

>>> I think this patch needs the following:
>>> --
>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>> index f2c829f1aff6..c2d4893ecb94 100644
>>> --- a/drivers/nvme/host/tcp.c
>>> +++ b/drivers/nvme/host/tcp.c
>>> @@ -767,8 +767,7 @@ static int nvme_tcp_handle_r2t(struct nvme_tcp_queue *queue,
>>>                  return -EPROTO;
>>>          }
>>>
>>> -       if (queue->request == req ||
>>> -           llist_on_list(&req->lentry) ||
>>> +       if (llist_on_list(&req->lentry) ||
>>>              !list_empty(&req->entry)) {
>>>                  dev_err(queue->ctrl->ctrl.device,
>>>                          "req %d unexpected r2t while processing request\n",
>>> --
>>>
>>> Or in your case - remove the above:
>>> --
>>>         WARN_ON(queue->request == req);
>>> --
>> Made that change, retested, and am now seeing the errors below. -71 is the -EPROTO that patch-2 returns.
>> Trying to narrow it down further. If you have any suggestion, let me know.
>>
>> [Mon Apr 21 11:18:58 2025] nvme nvme0: req 7 unexpected r2t while processing request
>> [Mon Apr 21 11:18:58 2025] nvme nvme0: receive failed:  -71
>> [Mon Apr 21 11:18:58 2025] nvme nvme0: starting error recovery
>> [Mon Apr 21 11:18:58 2025] nvme nvme0: req 11 unexpected r2t while processing request
>> [Mon Apr 21 11:18:58 2025] block nvme0n1: no usable path - requeuing I/O
>> [Mon Apr 21 11:18:58 2025] nvme nvme0: receive failed:  -71
With that last patch the test failed in under a minute. I had removed all the WARN_ONs by that point.

>>>> [2025-04-16 22:56:24.758] [Wed Apr 16 22:56:26 2025] WARNING: CPU: 22 PID: 134878 at tcp.c:782 nvme_tcp_recv_skb+0x115e/0x11c0 [nvme_tcp]
>>> My assumption is that this is a result of
>>> WARN_ON(queue->request == req);
>> From the gdb decode, it points to WARN_ON(!list_empty(&req->entry))
>
>Can you please apply this on top:
>--
>diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>index 9453cf453d03..334154a58bdb 100644
>--- a/drivers/nvme/host/tcp.c
>+++ b/drivers/nvme/host/tcp.c
>@@ -453,7 +453,7 @@ nvme_tcp_fetch_request(struct nvme_tcp_queue *queue)
>                         return NULL;
>         }
>
>-       list_del(&req->entry);
>+       list_del_init(&req->entry);
>         init_llist_node(&req->lentry);
>         return req;
>  }
>--
list_del_init did the trick. The test ran successfully for >18 hours, whereas
the original failure reproduced in <6 hours. Can one of you please merge this in?
Which branch will this be targeted at?
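
A likely reason the one-line change matters: in the kernel's list API, list_del() leaves the removed entry's next/prev pointers poisoned, so a later list_empty(&req->entry) on that entry returns false, while list_del_init() re-initializes the entry so the same check returns true. The userspace sketch below only illustrates that difference. It is not the driver code: the poison values are arbitrary stand-ins for LIST_POISON1/LIST_POISON2, and entry_idle_after_unlink() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
	struct list_head *next, *prev;
};

/* Arbitrary stand-ins for the kernel's LIST_POISON1/LIST_POISON2 (assumed values). */
#define POISON_NEXT ((struct list_head *)0x100)
#define POISON_PREV ((struct list_head *)0x122)

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* True only when the node points back at itself. */
static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void unlink_entry(struct list_head *e)
{
	e->next->prev = e->prev;
	e->prev->next = e->next;
}

/* Unlink and poison: the entry no longer points at itself. */
static void list_del(struct list_head *e)
{
	unlink_entry(e);
	e->next = POISON_NEXT;
	e->prev = POISON_PREV;
}

/* Unlink and re-initialize: the entry points at itself again. */
static void list_del_init(struct list_head *e)
{
	unlink_entry(e);
	INIT_LIST_HEAD(e);
}

/*
 * Hypothetical helper: queue one entry on a send list, remove it with
 * either variant, and report what list_empty() says about the entry.
 */
static int entry_idle_after_unlink(int use_init)
{
	struct list_head send_list, entry;

	INIT_LIST_HEAD(&send_list);
	INIT_LIST_HEAD(&entry);
	list_add_tail(&entry, &send_list);

	if (use_init)
		list_del_init(&entry);
	else
		list_del(&entry);

	assert(list_empty(&send_list));	/* the list itself is empty either way */
	return list_empty(&entry);
}
```

In this sketch entry_idle_after_unlink(0) reports the entry as not empty and entry_idle_after_unlink(1) reports it as empty, which would match the !list_empty(&req->entry) check tripping on a request that had already been fetched from the send list.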