Request timeout seen with NVMEoF TCP

Potnuri Bharat Teja bharat at chelsio.com
Thu Dec 10 12:06:47 EST 2020


On Wednesday, December 12/09/20, 2020 at 10:02:57 -0800, Sagi Grimberg wrote:
> 
> > Hi All,
> > I am seeing the following timeouts and reconnects on NVMF TCP initiator with latest v5.10-rc5
> > kernel.
> > I see the same behavior with nvme tree too (branch:nvme-5.11)
> > I last ran this with 5.8, where it was running fine.
> > 
> > Target configuration is, 1 target with 1gb ramdisk namespace. On intiator,
> > discover, connect and run fio or Iozone. Traces are seen within couple of minutes
> > after starting the test.
> 
> Hey Potnuri,
> 
> Can you also attach the target side logs? it seems like an I/O times out
> for no apparent reason..

I see nothing much logged on target.

Target dmesg:
---
[ 8371.979029] brd: module loaded
[ 8371.983368] nvmet: adding nsid 1 to subsystem nvme-ram0
[ 8371.988189] nvmet: adding nsid 1 to subsystem nvme-ram1
[ 8371.992766] nvmet: adding nsid 1 to subsystem nvme-ram2
[ 8371.997107] nvmet: adding nsid 1 to subsystem nvme-ram3
[ 8372.001881] nvmet_tcp: enabling port 1 (10.0.3.2:4420)
[ 8407.988148] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
[ 8417.287799] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
[ 8417.328557] nvmet: creating controller 1 for subsystem nvme-ram0 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
[ 8417.365155] nvmet: creating controller 2 for subsystem nvme-ram1 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
[ 8417.402337] nvmet: creating controller 3 for subsystem nvme-ram2 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
[ 8417.439785] nvmet: creating controller 4 for subsystem nvme-ram3 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.

[ 8511.903509] nvmet: creating controller 2 for subsystem nvme-ram1 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
[ 8552.862271] nvmet: creating controller 1 for subsystem nvme-ram0 for NQN nqn.2014-08.org.nvmexpress:uuid:77f6ffad-1c4a-4c0e-9f11-23cd4daf0216.
^^^^^^^ Logs due to reconnect ^^^^^^^^
---

Initiator dmesg:
---
[  923.976468] nvme nvme0: Removing ctrl: NQN "nvme-ram0"
[  985.213878] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 10.0.3.2:4420
[  985.215489] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[  994.513541] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 10.0.3.2:4420
[  994.515025] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[  994.554853] nvme nvme0: creating 16 I/O queues.
[  994.559725] nvme nvme0: mapped 16/0/0 default/read/poll queues.
[  994.565373] nvme nvme0: new ctrl: NQN "nvme-ram0", addr 10.0.3.2:4420
[  994.591559] nvme nvme1: creating 16 I/O queues.
[  994.596472] nvme nvme1: mapped 16/0/0 default/read/poll queues.
[  994.602392] nvme nvme1: new ctrl: NQN "nvme-ram1", addr 10.0.3.2:4420
[  994.628442] nvme nvme2: creating 16 I/O queues.
[  994.633391] nvme nvme2: mapped 16/0/0 default/read/poll queues.
[  994.639256] nvme nvme2: new ctrl: NQN "nvme-ram2", addr 10.0.3.2:4420
[  994.665958] nvme nvme3: creating 16 I/O queues.
[  994.670913] nvme nvme3: mapped 16/0/0 default/read/poll queues.
[  994.676894] nvme nvme3: new ctrl: NQN "nvme-ram3", addr 10.0.3.2:4420
[ 1007.040464] EXT4-fs (nvme0n1): mounting ext3 file system using the ext4 subsystem
[ 1007.044448] EXT4-fs (nvme0n1): mounted filesystem with ordered data mode. Opts: (null)
[ 1007.070111] EXT4-fs (nvme1n1): mounting ext3 file system using the ext4 subsystem
[ 1007.073819] EXT4-fs (nvme1n1): mounted filesystem with ordered data mode. Opts: (null)
[ 1007.099188] EXT4-fs (nvme2n1): mounting ext3 file system using the ext4 subsystem
[ 1007.102578] EXT4-fs (nvme2n1): mounted filesystem with ordered data mode. Opts: (null)
[ 1007.126442] EXT4-fs (nvme3n1): mounting ext3 file system using the ext4 subsystem
[ 1007.130367] EXT4-fs (nvme3n1): mounted filesystem with ordered data mode. Opts: (null)
[ 1078.867439] nvme nvme1: queue 14: timeout request 0x33 type 6
[ 1078.868266] nvme nvme1: starting error recovery
[ 1078.868775] nvme nvme1: queue 14: timeout request 0x34 type 6
[ 1078.869853] nvme nvme1: queue 14: timeout request 0x35 type 6
[ 1078.869908] nvme nvme1: queue 14: timeout request 0x36 type 4
[ 1078.869958] nvme nvme1: queue 14: timeout request 0x37 type 4
[ 1078.870586] block nvme1n1: no usable path - requeuing I/O
[ 1078.870686] block nvme1n1: no usable path - requeuing I/O
[ 1078.871752] block nvme1n1: no usable path - requeuing I/O
[ 1078.872573] nvme nvme1: Reconnecting in 10 seconds...
[ 1078.872949] block nvme1n1: no usable path - requeuing I/O
[ 1078.874802] block nvme1n1: no usable path - requeuing I/O
[ 1078.875685] block nvme1n1: no usable path - requeuing I/O
[ 1078.876588] block nvme1n1: no usable path - requeuing I/O
[ 1078.877456] block nvme1n1: no usable path - requeuing I/O
[ 1078.878248] block nvme1n1: no usable path - requeuing I/O
[ 1078.879115] block nvme1n1: no usable path - requeuing I/O
[ 1089.130331] nvme_ns_head_submit_bio: 5 callbacks suppressed
[ 1089.131050] block nvme1n1: no usable path - requeuing I/O
[ 1089.131925] block nvme1n1: no usable path - requeuing I/O
[ 1089.132619] block nvme1n1: no usable path - requeuing I/O
[ 1089.133446] block nvme1n1: no usable path - requeuing I/O
[ 1089.134082] block nvme1n1: no usable path - requeuing I/O
[ 1089.134724] block nvme1n1: no usable path - requeuing I/O
[ 1089.135464] block nvme1n1: no usable path - requeuing I/O
[ 1089.136070] block nvme1n1: no usable path - requeuing I/O
[ 1089.137102] nvme nvme1: creating 16 I/O queues.
[ 1089.181184] nvme nvme1: mapped 16/0/0 default/read/poll queues.
[ 1089.196437] nvme nvme1: Successfully reconnected (1 attempt)
[ 1119.827490] nvme nvme0: queue 16: timeout request 0x1a type 6
[ 1119.828223] nvme nvme0: starting error recovery
[ 1119.828801] nvme nvme0: queue 16: timeout request 0x1b type 4
[ 1119.829456] block nvme0n1: no usable path - requeuing I/O
[ 1119.830006] block nvme0n1: no usable path - requeuing I/O
[ 1119.830337] nvme nvme0: Reconnecting in 10 seconds...
[ 1119.830664] block nvme0n1: no usable path - requeuing I/O
[ 1130.088305] block nvme0n1: no usable path - requeuing I/O
[ 1130.089473] block nvme0n1: no usable path - requeuing I/O
[ 1130.090319] nvme nvme0: creating 16 I/O queues.
[ 1130.132996] nvme nvme0: mapped 16/0/0 default/read/poll queues.
[ 1130.133572] debugfs: Directory 'hctx0' with parent '/' already present!
[ 1130.134086] debugfs: Directory 'hctx1' with parent '/' already present!
[ 1130.134669] debugfs: Directory 'hctx2' with parent '/' already present!
[ 1130.135196] debugfs: Directory 'hctx3' with parent '/' already present!
[ 1130.135766] debugfs: Directory 'hctx4' with parent '/' already present!
[ 1130.137374] debugfs: Directory 'hctx5' with parent '/' already present!
[ 1130.138070] debugfs: Directory 'hctx6' with parent '/' already present!
[ 1130.138782] debugfs: Directory 'hctx7' with parent '/' already present!
[ 1130.139553] debugfs: Directory 'hctx8' with parent '/' already present!
[ 1130.140114] debugfs: Directory 'hctx9' with parent '/' already present!
[ 1130.140728] debugfs: Directory 'hctx10' with parent '/' already present!
[ 1130.141502] debugfs: Directory 'hctx11' with parent '/' already present!
[ 1130.142111] debugfs: Directory 'hctx12' with parent '/' already present!
[ 1130.142763] debugfs: Directory 'hctx13' with parent '/' already present!
[ 1130.143535] debugfs: Directory 'hctx14' with parent '/' already present!
[ 1130.144114] debugfs: Directory 'hctx15' with parent '/' already present!
[ 1130.150104] nvme nvme0: Successfully reconnected (1 attempt)
---

I see the timeouts almost immediately after starting iozone. As expected I see 
IO blocked for a couple of seconds untill host reconnects back.

Here is my test:
-> create 4 NVMF-TCP targets with 1gb ramdisk namespace each.
-> discover and connect all targets.
-> format nvme devices as ext4, mount them. (# mkfs.ext4 -F /dev/nvme0n1)
-> run iozone over all mounted disks (# iozone -a -g 512m)

What else can I check?



More information about the Linux-nvme mailing list