Request timeout seen with NVMEoF TCP

Samuel Jones sjones at kalrayinc.com
Mon Dec 14 14:27:40 EST 2020


Hi Sagi,

Without instrumenting the driver it's hard to say what might be happening here. But I did make a few comments at the end of my initial email which might be relevant:

1. It seems abnormal to me that the direct send path does not check the return value of nvme_tcp_try_send(), given that this routine at least claims to be able to fail transiently. nvme_tcp_queue_request() should reschedule io_work if nvme_tcp_try_send() does not return 1, IMHO (see the first sketch below).

2. If, for whatever reason, kernel_sendpage() or sock_no_sendpage() returns -EAGAIN, nvme_tcp_try_send() returns 0. nvme_tcp_io_work() interprets this as meaning there is nothing more to do. That is wrong: there is in fact more work to do, and nvme_tcp_io_work() should reschedule itself in this case. Unfortunately, as the code stands, nvme_tcp_io_work() has no way of distinguishing between "nvme_tcp_try_send() returned 0 because there is nothing to do" and "nvme_tcp_try_send() returned 0 because of a transient failure" (see the second sketch below).

I'm not sure how likely these cases are in practice, but theoretically they can occur.
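
To make point 1 concrete, here is a rough, untested sketch of what I have in mind for the direct send path in nvme_tcp_queue_request(). The locals (empty, sync, last) and the queue_work_on()/nvme_tcp_wq calls are paraphrased from my reading of the current driver, so treat this as an illustration rather than a patch:

	int ret;

	if (queue->io_cpu == raw_smp_processor_id() && sync &&
	    empty && mutex_trylock(&queue->send_mutex)) {
		ret = nvme_tcp_try_send(queue);
		mutex_unlock(&queue->send_mutex);
		/*
		 * 1 means a PDU actually went out; 0 or a negative value
		 * means the request is still sitting on the list, so make
		 * sure io_work picks it up rather than leaving it behind.
		 */
		if (ret != 1)
			queue_work_on(queue->io_cpu, nvme_tcp_wq,
				      &queue->io_work);
	} else if (last) {
		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
	}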
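And for point 2, one option would be to stop folding -EAGAIN into 0 at the end of nvme_tcp_try_send() and to treat it as "still pending" in nvme_tcp_io_work(). Again, this is only a sketch against my reading of the code (the done: label and the pending loop are how I remember them), not something I have tested:

	/*
	 * At the end of nvme_tcp_try_send(): keep the transient error
	 * distinguishable from "nothing to send".
	 */
done:
	if (ret == -EAGAIN)
		return -EAGAIN;
	return ret;

	/*
	 * In nvme_tcp_io_work(): treat -EAGAIN as "more work pending",
	 * so the loop keeps going or reschedules instead of going idle.
	 */
	if (mutex_trylock(&queue->send_mutex)) {
		result = nvme_tcp_try_send(queue);
		mutex_unlock(&queue->send_mutex);
		if (result > 0 || result == -EAGAIN)
			pending = true;
		else if (unlikely(result < 0))
			return;
	}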

Regards
Samuel

----- Original Message -----
From: "Potnuri Bharat Teja" <bharat at chelsio.com>
To: "Sagi Grimberg" <sagi at grimberg.me>
Cc: hch at lst.de, "linux-nvme" <linux-nvme at lists.infradead.org>, "Samuel Jones" <sjones at kalrayinc.com>
Sent: Monday, 14 December, 2020 19:05:31
Subject: Re: Request timeout seen with NVMEoF TCP

On Monday, 12/14/2020 at 09:51:28 -0800, Sagi Grimberg wrote:
> 
> > Hi Sagi,
> > With the above patch I still see the issue, but less frequently. Without the
> > patch I was able to consistently reproduce the timeouts with 4 target devices.
> > With the patch I see IO running fine for 4 targets. I tried the same test with
> > 8 target devices and I see the below timeout. I've observed only one instance
> > of the timeout, so I'll let it run for some more time or rerun and update.
> 
> Hey Potnuri,
> 
> Have you observed this further?
> 
> I'd think that if io_work reschedules itself when it races
> with the direct send path this should not happen, but we may be
> seeing a different race going on here. Adding Samuel, who saw
> a similar phenomenon.
> 

Yes, I have seen it since then too, but only very intermittently. Below is the dmesg
that I saved before my test was stopped for an unrelated reason. From the timestamps,
the timeouts were roughly ~20 min, ~45 min and ~3 hours apart.

dmesg on initiator:
---
[ 1735.038634] EXT4-fs (nvme7n1): mounted filesystem with ordered data mode. Opts: (null)
[ 2111.990419] nvme nvme5: queue 7: timeout request 0x57 type 4
[ 2111.991835] nvme nvme5: starting error recovery
[ 2111.998796] block nvme5n1: no usable path - requeuing I/O
[ 2111.998816] nvme nvme5: Reconnecting in 10 seconds...
[ 2122.253431] block nvme5n1: no usable path - requeuing I/O
[ 2122.254732] nvme nvme5: creating 16 I/O queues.
[ 2122.301169] nvme nvme5: mapped 16/0/0 default/read/poll queues.
[ 2122.314229] nvme nvme5: Successfully reconnected (1 attempt)
[ 2522.795467] perf: interrupt took too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[ 3043.544758] perf: interrupt took too long (3135 > 3128), lowering kernel.perf_event_max_sample_rate to 63000
[ 4197.260041] perf: interrupt took too long (3922 > 3918), lowering kernel.perf_event_max_sample_rate to 50000
[ 4848.175367] nvme nvme1: queue 6: timeout request 0x42 type 4
[ 4848.176086] nvme nvme1: starting error recovery
[ 4848.176864] nvme nvme1: queue 6: timeout request 0x43 type 4
[ 4848.178194] nvme nvme1: queue 6: timeout request 0x44 type 4
[ 4848.179873] block nvme1n1: no usable path - requeuing I/O
[ 4848.180665] block nvme1n1: no usable path - requeuing I/O
[ 4848.180775] nvme nvme1: Reconnecting in 10 seconds...
[ 4848.181372] block nvme1n1: no usable path - requeuing I/O
[ 4848.183022] block nvme1n1: no usable path - requeuing I/O
[ 4848.183782] block nvme1n1: no usable path - requeuing I/O
[ 4858.436889] block nvme1n1: no usable path - requeuing I/O
[ 4858.437558] block nvme1n1: no usable path - requeuing I/O
[ 4858.438168] block nvme1n1: no usable path - requeuing I/O
[ 4858.439547] nvme nvme1: creating 16 I/O queues.
[ 4858.489650] nvme nvme1: mapped 16/0/0 default/read/poll queues.
[ 4858.490357] debugfs: Directory 'hctx0' with parent '/' already present!
[ 4858.491256] debugfs: Directory 'hctx1' with parent '/' already present!
[ 4858.492790] debugfs: Directory 'hctx2' with parent '/' already present!
[ 4858.493601] debugfs: Directory 'hctx3' with parent '/' already present!
[ 4858.494362] debugfs: Directory 'hctx4' with parent '/' already present!
[ 4858.495310] debugfs: Directory 'hctx5' with parent '/' already present!
[ 4858.496261] debugfs: Directory 'hctx6' with parent '/' already present!
[ 4858.497098] debugfs: Directory 'hctx7' with parent '/' already present!
[ 4858.497972] debugfs: Directory 'hctx8' with parent '/' already present!
[ 4858.498806] debugfs: Directory 'hctx9' with parent '/' already present!
[ 4858.499574] debugfs: Directory 'hctx10' with parent '/' already present!
[ 4858.500268] debugfs: Directory 'hctx11' with parent '/' already present!
[ 4858.501030] debugfs: Directory 'hctx12' with parent '/' already present!
[ 4858.501807] debugfs: Directory 'hctx13' with parent '/' already present!
[ 4858.502459] debugfs: Directory 'hctx14' with parent '/' already present!
[ 4858.503196] debugfs: Directory 'hctx15' with parent '/' already present!
[ 4858.510075] nvme nvme1: Successfully reconnected (1 attempt)
[12905.117166] perf: interrupt took too long (4944 > 4902), lowering kernel.perf_event_max_sample_rate to 40000
[15436.031342] nvme nvme0: queue 16: timeout request 0x22 type 4
[15436.031920] nvme nvme0: starting error recovery
[15436.033746] block nvme0n1: no usable path - requeuing I/O
[15436.035221] nvme nvme0: Reconnecting in 10 seconds...
[15446.292544] block nvme0n1: no usable path - requeuing I/O
[15446.293541] nvme nvme0: creating 16 I/O queues.
[15446.339595] nvme nvme0: mapped 16/0/0 default/read/poll queues.
[15446.340846] debugfs: Directory 'hctx0' with parent '/' already present!
[15446.342664] debugfs: Directory 'hctx1' with parent '/' already present!
[15446.343413] debugfs: Directory 'hctx2' with parent '/' already present!
[15446.344228] debugfs: Directory 'hctx3' with parent '/' already present!
[15446.345155] debugfs: Directory 'hctx4' with parent '/' already present!
[15446.345948] debugfs: Directory 'hctx5' with parent '/' already present!
[15446.346767] debugfs: Directory 'hctx6' with parent '/' already present!
[15446.347368] debugfs: Directory 'hctx7' with parent '/' already present!
[15446.348129] debugfs: Directory 'hctx8' with parent '/' already present!
[15446.348948] debugfs: Directory 'hctx9' with parent '/' already present!
[15446.349538] debugfs: Directory 'hctx10' with parent '/' already present!
[15446.350185] debugfs: Directory 'hctx11' with parent '/' already present!
[15446.350757] debugfs: Directory 'hctx12' with parent '/' already present!
[15446.351539] debugfs: Directory 'hctx13' with parent '/' already present!
[15446.352200] debugfs: Directory 'hctx14' with parent '/' already present!
[15446.352722] debugfs: Directory 'hctx15' with parent '/' already present!
[15446.358750] nvme nvme0: Successfully reconnected (1 attempt)

---


> > 
> > Target dmesg:
> > ---
> > [ 1704.132366] nvmet: creating controller 1 for subsystem nvme-ram0 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
> > [ 1704.185987] nvmet: creating controller 2 for subsystem nvme-ram1 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
> > [ 1704.230065] nvmet: creating controller 3 for subsystem nvme-ram2 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
> > [ 1704.277712] nvmet: creating controller 4 for subsystem nvme-ram3 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
> > [ 1704.314457] nvmet: creating controller 5 for subsystem nvme-ram4 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
> > [ 1704.370124] nvmet: creating controller 6 for subsystem nvme-ram5 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
> > [ 1704.435581] nvmet: creating controller 7 for subsystem nvme-ram6 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
> > [ 1704.501813] nvmet: creating controller 8 for subsystem nvme-ram7 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
> > [ 2103.965017] nvmet: creating controller 6 for subsystem nvme-ram5 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
> > ^^^^^^^^^^^^^^^
> > ---
> > 
> > Initiator dmesg:
> > ---
> > [ 1735.038634] EXT4-fs (nvme7n1): mounted filesystem with ordered data mode. Opts: (null)
> > [ 2111.990419] nvme nvme5: queue 7: timeout request 0x57 type 4
> > [ 2111.991835] nvme nvme5: starting error recovery
> > [ 2111.998796] block nvme5n1: no usable path - requeuing I/O
> > [ 2111.998816] nvme nvme5: Reconnecting in 10 seconds...
> > [ 2122.253431] block nvme5n1: no usable path - requeuing I/O
> > [ 2122.254732] nvme nvme5: creating 16 I/O queues.
> > [ 2122.301169] nvme nvme5: mapped 16/0/0 default/read/poll queues.
> > [ 2122.314229] nvme nvme5: Successfully reconnected (1 attempt)
> > ---
> > 
> > 
> > Thanks.
> >



