nvme-tcp: i/o errors and stalled host from failure to send command pdus

Thu Sep 1 05:37:30 PDT 2022

On Wed, Aug 31, 2022 at 7:42 PM Sagi Grimberg <sagi at grimberg.me> wrote:
>
>
> >
> > Here's a trace including state from nvme_tcp_try_send_cmd_pdu(), like so.
> >
> > req->dbg.send = 1;
> > ret = kernel_sendpage(queue->sock, virt_to_page(pdu), offset, len, flags);
> > req->dbg.sendpage = ret;
> >
> >
> > nvme-trace.out
> >
> > tar-1785    [005] .....   232.768892:nvme_tcp_queue_rq:
> >   nvme1: qid=6 tag=18 op=0 data_len=20480
> > tar-1785    [005] .....   232.768894: nvme_tcp_queue_rq:
> >   nvme1: qid=6 tag=17 op=0 data_len=32768
> > tar-1785    [005] .N...   232.768895: nvme_tcp_queue_rq:
> >   nvme1: qid=6 tag=14 op=0 data_len=32768
> > kworker/5:1H-475     [005] .....   232.768923: nvme_tcp_queue_rq:
> >   nvme1: qid=6 tag=13 op=0 data_len=32768
> > kworker/5:1H-475     [005] .....   232.768924: nvme_tcp_queue_rq:
> >   nvme1: qid=6 tag=12 op=0 data_len=12288
> > tar-1785    [007] .....   232.769141: nvme_tcp_queue_rq:
> >   nvme1: qid=8 tag=69 op=0 data_len=20480
> >
> >
> > dmesg.out (reordered to match nvme-trace.out)
> >
> > [  262.889536] nvme nvme1: state: tag 0x18 io_cpu 5 smp_id 5 smp_id2 5 sync 1
> >   empty 1 lock 1 last 0 more 0 send 0 sendpage -1
>
> Something is not adding up...
> lock=1, means that we get into nvme_tcp_send_all(), but somehow we are
> not able to send it to the wire?
>
> I'd look into this and add indication that it is pulled from the
> req_list, and in turn from the send_list. It looks like it is somehow
> picked out and silteny dropped, because as long as it is on the list(s)
> another context should have picked it up at some point...
>
> > [  262.889337] nvme nvme1: state: tag 0x17 io_cpu 5 smp_id 5 smp_id2 5 sync 1
> >   empty 1 lock 1 last 0 more 0 send 0 sendpage -1
> > [  262.889110] nvme nvme1: state: tag 0x14 io_cpu 5 smp_id 5 smp_id2 7 sync 1
> >   empty 1 lock 1 last 0 more 0 send 0 sendpage -1
> > [  262.888864] nvme nvme1: state: tag 0x13 io_cpu 5 smp_id 5 smp_id2 5 sync 1
> >   empty 0 lock 0 last 0 more 1 send 1 sendpage 72
> > [  262.888727] nvme nvme1: state: tag 0x12 io_cpu 5 smp_id 5 smp_id2 5 sync 1
> >   empty 0 lock 0 last 1 more 1 send 1 sendpage 72

Ugh. It was misleading because the send state tracking was
re-initialized after the direct send case. Here's a better trace with
more information. It appears as though sendpage() is always called
with the NOTLAST flag set. I suppose this could mean that there is
some confusion or race with nvme_tcp_queue_more()?

tar-1872          [004] ..... 389.559784: nvme_tcp_queue_rq:
 nvme1: qid=5 tag=28 op=0 data_len=16384
tar-1872          [004] .N... 389.559786: nvme_tcp_queue_rq:
 nvme1: qid=5 tag=27 op=0 data_len=16384
kworker/4:2H-1873 [004] ..... 389.559798: nvme_tcp_queue_rq:
 nvme1: qid=5 tag=26 op=0 data_len=32768
kworker/4:2H-1873 [004] ..... 389.559798: nvme_tcp_queue_rq:
 nvme1: qid=5 tag=25 op=0 data_len=32768
kworker/4:2H-1873 [004] ..... 389.559799: nvme_tcp_queue_rq:
 nvme1: qid=5 tag=24 op=0 data_len=32768

[  419.790505] nvme nvme1: state: tag 0x28 io_cpu 4 smp_id 4 smp_id2 4 sync 1
 empty 1 lock 1 last 0 more 0 send 1 send_ret 72 send_flags 0x28040
[  419.790299] nvme nvme1: state: tag 0x27 io_cpu 4 smp_id 4 smp_id2 0 sync 1
 empty 1 lock 1 last 0 more 0 send 1 send_ret 72 send_flags 0x28040
[  419.789809] nvme nvme1: state: tag 0x25 io_cpu 4 smp_id 4 smp_id2 4 sync 1
 empty 0 lock 0 last 0 more 1 send 1 send_ret 72 send_flags 0x28040
[  419.790065] nvme nvme1: state: tag 0x26 io_cpu 4 smp_id 4 smp_id2 4 sync 1
 empty 0 lock 0 last 0 more 1 send 1 send_ret 72 send_flags 0x28040
[  419.789663] nvme nvme1: state: tag 0x24 io_cpu 4 smp_id 4 smp_id2 4 sync 1
 empty 0 lock 0 last 1 more 1 send 1 send_ret 72 send_flags 0x28040