[PATCH 3/3] nvme-tcp: fix I/O stalls on congested sockets
Kamaljit Singh
Kamaljit.Singh1 at wdc.com
Mon Apr 21 12:11:48 PDT 2025
Sagi,
>However the log prints are WARN messages that does not exist in any of
>the patches.
Just found the issue with my patch integration. I had picked up Hannes' patch in email
titled "[PATCH 2/3] nvme-tcp: sanitize request list handling" and dated Fri Match 7th.
This had the 3 WARN_ON msgs.
I just found another patch-set with almost the same title dated March 27th.
I'll integrate this one "PATCH 2/5] nvme-tcp: sanitize request list handling".
This replaced those WARN_ON msgs w/ dev_err() after if-checks for the same
conditions.
>> The patches I applied are these:
>> 1: nvme-tcp: partial write tcp-q-wake-sender untested by Sagi
>> 2: nvme-tcp: recv-side EAGAIN untested fix by Sagi
>> 3: nvme-tcp: fix I/O stalls on congested sockets
>
>The patch I suggested was a replacement for patch (3). Didn't you
>get conflicts?
No. But I reviewed it again and have undone this part of patch-3 in my setup.
- else if (unlikely(result < 0))
+ else if (unlikely(result < 0) && result != -EAGAIN)
>> 4: nvme-tcp: sanitize request list handling
>> 5: nvme-tcp: avoid inline sending when handling R2T PDUs
>
>> [4] nvme-tcp: sanitize request list handling
>>
>> Validate the request in nvme_tcp_handle_r2t() to ensure it's not
>> part of any list, otherwise a malicious R2T PDU might inject a
>> loop in request list processing.
>>
>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>> index 40dce5dc3ff5..4647d5d547f8 100644
>> --- a/drivers/nvme/host/tcp.c
>> +++ b/drivers/nvme/host/tcp.c
>> @@ -455,6 +455,7 @@ nvme_tcp_fetch_request(struct nvme_tcp_queue *queue)
>> }
>>
>> list_del(&req->entry);
>> + init_llist_node(&req->lentry);
>> return req;
>> }
>>
>> @@ -562,6 +563,8 @@ static int nvme_tcp_init_request(struct blk_mq_tag_set *set,
>> req->queue = queue;
>> nvme_req(rq)->ctrl = &ctrl->ctrl;
>> nvme_req(rq)->cmd = &pdu->cmd;
>> + init_llist_node(&req->lentry);
>> + INIT_LIST_HEAD(&req->entry);
>>
>> return 0;
>> }
>> @@ -773,8 +776,12 @@ static int nvme_tcp_handle_r2t(struct nvme_tcp_queue *queue,
>>
>> nvme_tcp_setup_h2c_data_pdu(req);
>>
>> + WARN_ON(queue->request == req);
>> + WARN_ON(llist_on_list(&req->lentry));
>> + WARN_ON(!list_empty(&req->entry));
>
>These are not included in Hannes patch afaict.
They were, as I noted on above.
>I'm assuming that these are the stack traces that are firing...
>
>Can you please share the output of gdb:
>l *nvme_tcp_recv_skb+0x115e
It points to the 3rd WARN_ON msg. I've replaced those warnings with the patch you suggested below.
(gdb) l *(nvme_tcp_recv_skb+0x115e)
0x628e is in nvme_tcp_recv_skb (tcp.c:782).
777
778 nvme_tcp_setup_h2c_data_pdu(req);
779
780 WARN_ON(queue->request == req);
781 WARN_ON(llist_on_list(&req->lentry));
782 WARN_ON(!list_empty(&req->entry));
783 llist_add(&req->lentry, &queue->req_list);
784 if (list_empty(&queue->send_list))
785 queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
>
>My assumption is that in your case, queue->request is still referenced
>with the current request when the r2t for it arrives, which is very much
>possible...
>
>So this means that the patch "nvme-tcp: sanitize request list handling"
>is wrong, queue->request may still be referenced when its r2t arrives.
>
>I think this patch needs the following:
>--
>diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>index f2c829f1aff6..c2d4893ecb94 100644
>--- a/drivers/nvme/host/tcp.c
>+++ b/drivers/nvme/host/tcp.c
>@@ -767,8 +767,7 @@ static int nvme_tcp_handle_r2t(struct nvme_tcp_queue
>*queue,
> return -EPROTO;
> }
>
>- if (queue->request == req ||
>- llist_on_list(&req->lentry) ||
>+ if (llist_on_list(&req->lentry) ||
> !list_empty(&req->entry)) {
> dev_err(queue->ctrl->ctrl.device,
> "req %d unexpected r2t while processing request\n",
>--
>
>Or in your case - remove the above:
>--
> WARN_ON(queue->request == req);
>--
Made that change. Tested and seeing this. -71 is -EPROTO that patch-2 returns .
Trying to narrow it down further. If you have any suggestion, let me know.
[Mon Apr 21 11:18:58 2025] nvme nvme0: req 7 unexpected r2t while processing request
[Mon Apr 21 11:18:58 2025] nvme nvme0: receive failed: -71
[Mon Apr 21 11:18:58 2025] nvme nvme0: starting error recovery
[Mon Apr 21 11:18:58 2025] nvme nvme0: req 11 unexpected r2t while processing request
[Mon Apr 21 11:18:58 2025] block nvme0n1: no usable path - requeuing I/O
[Mon Apr 21 11:18:58 2025] nvme nvme0: receive failed: -71
>> [2025-04-16 22:56:24.758] [Wed Apr 16 22:56:26 2025] WARNING: CPU: 22 PID: 134878 at tcp.c:782 nvme_tcp_recv_skb+0x115e/0x11c0 [nvme_tcp]
>
>My assumption is that this is a result of
>WARN_ON(queue->request == req);
>From the gdb decode it points to WARN_ON(!list_empty(&req->entry))
More information about the Linux-nvme
mailing list