nvme tcp receive errors

Sagi Grimberg sagi at grimberg.me
Mon May 10 19:18:24 BST 2021


> Sagi,
> 
> Just wanted to give you an update on where we're at with this.
> 
> All tests run with your earlier patch removing the inline dispatch from
> nvme_tcp_queue_request() are successful. At this point, I am leaning toward
> removing that optimization from mainline.

Thanks Keith,

Did you run it with the extra debug-information patch I sent you? My
concern is that, since you have the only environment where this
reproduces, once the optimization is removed it will be very difficult
to add it back in.

Also, what about the read issue? That one is still unresolved from
my PoV.

> I added additional tracing to see what is going on, but we eventually
> hit a memory issue after some hours of runtime. I've never seen an issue
> like this before. It triggers in nvme_tcp_advance_req() when tracing the
> rq->tag and req->data_sent:
> 
>    WARNING: CPU: 1 PID: 3428 at arch/x86/include/asm/kfence.h:44 kfence_protect_page+0x33/0xa0
> 
> I think the above is a distraction, but I can provide the full stack
> trace and patch adding the tracepoint if you think it's helpful.

That is... odd..


