crash at nvme_tcp_init_iter with header digest enabled
Sagi Grimberg
sagi at grimberg.me
Sun Aug 28 05:09:53 PDT 2022
> Hi,
>
> we got a customer bug report against our downstream kernel
> when doing fail over tests with header digest enabled.
>
> The whole crash looks like a use-after-free bug, but
> so far we were not able to figure out where it happens.
>
> nvme nvme13: queue 1: header digest flag is cleared
> nvme nvme13: receive failed: -71
> nvme nvme13: starting error recovery
> nvme nvme7: Reconnecting in 10 seconds...
>
> RIP: nvme_tcp_init_iter
>
> nvme_tcp_recv_skb
> ? tcp_mstamp_refresh
> ? nvme_tcp_submit_async_event
> tcp_read_sock
> nvme_tcp_try_recv
> nvme_tcp_io_work
> process_one_work
> ? process_one_work
> worker_thread
> ? process_one_work
> kthread
> ? set_kthread_struct
> ret_from_fork
>
> In order to rule out that this is caused by a reuse of a command ID, I
> added a test patch which always clears the request pointer (see below)
> and hoped to see
>
> "got bad cqe.command_id %#x on queue %d\n"
>
> but there was none. Instead the crash disappeared. It looks like we are
> not clearing the request in the error path, but so far I haven't figured
> out how this is related to header digest being enabled.
>
> Anyway, this is just an FYI; in case anyone has an idea where to poke
> at, I am listening.
I think I see the problem. The stream is corrupted, and we keep
processing it.
The current logic says that once we hit a header-digest problem, we
immediately stop reading from the socket (rd_enabled=false) and trigger
error recovery.
When rd_enabled=false, we don't act on data_ready callbacks, as we know
we are tearing down the socket. However, we may keep reading from the
socket if the io_work continues and calls try_recv again (mainly because
our error from nvme_tcp_recv_skb is not propagated back).
I think that this will make the issue go away:
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index e82dcfcda29b..3e3ebde4eff5 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1229,7 +1229,7 @@ static void nvme_tcp_io_work(struct work_struct *w)
else if (unlikely(result < 0))
return;
- if (!pending)
+ if (!pending || !queue->rd_enabled)
return;
} while (!time_after(jiffies, deadline)); /* quota is exhausted */
--