crash at nvme_tcp_init_iter with header digest enabled
Sagi Grimberg
sagi at grimberg.me
Mon Sep 5 02:46:40 PDT 2022
>> Hi,
>>
>> we got a customer bug report against our downstream kernel
>> when doing failover tests with header digest enabled.
>>
>> The whole crash looks like a use-after-free bug, but
>> so far we have not been able to figure out where it happens.
>>
>> nvme nvme13: queue 1: header digest flag is cleared
>> nvme nvme13: receive failed: -71
>> nvme nvme13: starting error recovery
>> nvme nvme7: Reconnecting in 10 seconds...
>>
>> RIP: nvme_tcp_init_iter
>>
>> nvme_tcp_recv_skb
>> ? tcp_mstamp_refresh
>> ? nvme_tcp_submit_async_event
>> tcp_read_sock
>> nvme_tcp_try_recv
>> nvme_tcp_io_work
>> process_one_work
>> ? process_one_work
>> worker_thread
>> ? process_one_work
>> kthread
>> ? set_kthread_struct
>> ret_from_fork
>>
>> In order to rule out that this is caused by reuse of a command ID, I
>> added a test patch which always clears the request pointer (see below)
>> and hoped to see
>>
>> "got bad cqe.command_id %#x on queue %d\n"
>>
>> but there was none. Instead, the crash disappeared. It looks like we are
>> not clearing the request in the error path, but so far I haven't figured
>> out how this is related to header digest being enabled.
>>
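>> (The test patch itself is snipped here.) For reference, the check that
>> emits that message lives in nvme_tcp_process_nvme_cqe(); roughly like
>> this, simplified from upstream drivers/nvme/host/tcp.c (our downstream
>> code may differ):
>> --
>> static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
>> 		struct nvme_completion *cqe)
>> {
>> 	struct request *rq;
>>
>> 	/* map the wire command_id back to a block layer request; a
>> 	 * stale or reused command_id makes this lookup fail */
>> 	rq = nvme_find_rq(nvme_tcp_tagset(queue), cqe->command_id);
>> 	if (!rq) {
>> 		dev_err(queue->ctrl->ctrl.device,
>> 			"got bad cqe.command_id %#x on queue %d\n",
>> 			cqe->command_id, nvme_tcp_queue_id(queue));
>> 		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
>> 		return -EINVAL;
>> 	}
>>
>> 	/* ... complete rq with cqe->status / cqe->result ... */
>> 	return 0;
>> }
>> --
>>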
>> Anyway, this is just an FYI, in case anyone has an idea where to
>> poke; I am listening.
>
> I think I see the problem. The stream is corrupted, and we keep
> processing it.
>
> The current logic says that once we hit a header-digest problem, we
> immediately stop reading from the socket (rd_enabled=false) and trigger
> error recovery.
>
> When rd_enabled=false, we don't act on data_ready callbacks, as we know
> we are tearing down the socket. However, we may keep reading from the
> socket if the io_work continues and calls try_recv again (mainly because
> our error from nvme_tcp_recv_skb is not propagated back).
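>
> To illustrate (a simplified sketch of the receive path, not verbatim):
> nvme_tcp_recv_skb flags the error, but tcp_read_sock reports back the
> bytes that were consumed before the failure, so io_work cannot tell
> that anything went wrong:
> --
> static int nvme_tcp_recv_skb(read_descriptor_t *desc, struct sk_buff *skb,
> 		unsigned int offset, size_t len)
> {
> 	struct nvme_tcp_queue *queue = desc->arg.data;
> 	size_t consumed = len;
> 	int result;
>
> 	while (len) {
> 		switch (nvme_tcp_recv_state(queue)) {
> 		case NVME_TCP_RECV_PDU:
> 			result = nvme_tcp_recv_pdu(queue, skb, &offset, &len);
> 			break;
> 		case NVME_TCP_RECV_DATA:
> 			result = nvme_tcp_recv_data(queue, skb, &offset, &len);
> 			break;
> 		case NVME_TCP_RECV_DDGST:
> 			result = nvme_tcp_recv_ddgst(queue, skb, &offset, &len);
> 			break;
> 		default:
> 			result = -EFAULT;
> 		}
> 		if (result) {
> 			/* stream is corrupted: stop acting on data_ready
> 			 * and kick off error recovery */
> 			queue->rd_enabled = false;
> 			nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> 			return result;
> 		}
> 	}
> 	return consumed;
> }
> --
> tcp_read_sock() only returns the actor's error when nothing was
> consumed; if earlier skbs were processed fine, nvme_tcp_try_recv()
> still returns a positive count, so io_work sets pending and loops
> straight back into try_recv on the corrupted stream.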
>
> I think that this will make the issue go away:
> --
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index e82dcfcda29b..3e3ebde4eff5 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1229,7 +1229,7 @@ static void nvme_tcp_io_work(struct work_struct *w)
> else if (unlikely(result < 0))
> return;
>
> - if (!pending)
> + if (!pending || !queue->rd_enabled)
> return;
>
> } while (!time_after(jiffies, deadline)); /* quota is exhausted */
> --
Daniel, any input here?