Data corruption when using multiple devices with NVMEoF TCP
Sagi Grimberg
sagi at grimberg.me
Tue Dec 22 14:29:22 EST 2020
Hey Hao,
> I'm using kernel 5.2.9 with following related configs enabled:
> CONFIG_NVME_CORE=y
> CONFIG_BLK_DEV_NVME=y
> CONFIG_NVME_MULTIPATH=y
> CONFIG_NVME_FABRICS=m
> # CONFIG_NVME_FC is not set
> CONFIG_NVME_TCP=m
> CONFIG_NVME_TARGET=m
> CONFIG_NVME_TARGET_LOOP=m
> # CONFIG_NVME_TARGET_FC is not set
> CONFIG_NVME_TARGET_TCP=m
> CONFIG_RTC_NVMEM=y
> CONFIG_NVMEM=y
> CONFIG_NVMEM_SYSFS=y
>
> On target side, I exported 2 NVMe devices using tcp/ipv6:
> [root at rtptest34337.prn2 ~/ext_nvme]# ll
> /sys/kernel/config/nvmet/ports/1/subsystems/
> total 0
> lrwxrwxrwx 1 root root 0 Dec 19 02:08 nvmet-rtptest34337-1 ->
> ../../../../nvmet/subsystems/nvmet-rtptest34337-1
> lrwxrwxrwx 1 root root 0 Dec 19 02:08 nvmet-rtptest34337-2 ->
> ../../../../nvmet/subsystems/nvmet-rtptest34337-2
>
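[Editor's note: for context, an nvmet-tcp subsystem like the ones listed above is typically wired up through configfs roughly as follows. This is a hypothetical reconstruction; the backing device path, listen address, and service ID are placeholders, not taken from the report.]

```shell
# Sketch of a typical nvmet-tcp target setup via configfs (run as root).
# Subsystem name matches the listing above; everything else is a placeholder.
SUBSYS=nvmet-rtptest34337-1
cd /sys/kernel/config/nvmet

# Create the subsystem and allow any host to connect (test setup only).
mkdir subsystems/$SUBSYS
echo 1 > subsystems/$SUBSYS/attr_allow_any_host

# Attach a backing block device as namespace 1 (placeholder device path).
mkdir subsystems/$SUBSYS/namespaces/1
echo /dev/nvme0n1 > subsystems/$SUBSYS/namespaces/1/device_path
echo 1 > subsystems/$SUBSYS/namespaces/1/enable

# Create a TCP/IPv6 port and export the subsystem on it.
mkdir ports/1
echo tcp  > ports/1/addr_trtype
echo ipv6 > ports/1/addr_adrfam
echo ::1  > ports/1/addr_traddr      # placeholder listen address
echo 4420 > ports/1/addr_trsvcid     # default NVMe-oF port
ln -s /sys/kernel/config/nvmet/subsystems/$SUBSYS ports/1/subsystems/$SUBSYS
```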
> On initiator side, I could successfully connect the 2 nvme devices,
> nvme1n1 & nvme2n1;
> [root at rtptest34206.prn2 /]# nvme list
> Node             SN           Model           Namespace  Usage                    Format        FW Rev
> ---------------- -----------  --------------  ---------  -----------------------  ------------  --------
> /dev/nvme0n1     ***********  INTEL *******   1          256.06 GB / 256.06 GB    512 B + 0 B   PSF119D
> /dev/nvme1n1     ***********  Linux           1          900.19 GB / 900.19 GB    4 KiB + 0 B   5.2.9-0_
> /dev/nvme2n1     ***********  Linux           1          900.19 GB / 900.19 GB    4 KiB + 0 B   5.2.9-0_
>
> Then for each of nvme1n1 & nvme2n1, I created a partition using fdisk;
> the partition type is "Linux raid autodetect".
> Next I created a RAID-0 volume using mdadm, created a filesystem on
> it, and mounted it:
> # mdadm --create /dev/md5 --level=0 --raid-devices=2 --chunk=128 \
>     /dev/nvme1n1p1 /dev/nvme2n1p1
> # mkfs.xfs -f /dev/md5
> # mkdir /flash
> # mount -o rw,noatime,discard /dev/md5 /flash/
>
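[Editor's note: when chasing corruption on a striped array over 4K-sector namespaces, it can help to double-check the array geometry and sector sizes. A diagnostic sketch, using the device names from the report:]

```shell
# Sanity checks on the array and its members (run as root).
mdadm --detail /dev/md5                      # raid level, 128K chunk, member state
blockdev --getss --getpbsz /dev/nvme1n1p1    # logical / physical sector size
cat /sys/block/nvme1n1/queue/max_sectors_kb  # largest I/O the queue will accept
```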
> Now, when I copy a large directory into /flash/, a lot of files under
> /flash/ are corrupted.
> Specifically, that large directory has a lot of .gz files, and unzip
> fails on many of them; diff against the original files also shows
> they differ, even though the file sizes are exactly the same.
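[Editor's note: a quick way to quantify this kind of corruption is to checksum the source tree before copying and compare against the copy. A sketch; the paths are placeholders, with `/tmp/nvme_copy_dst` standing in for a directory under the /flash/ mount from the report:]

```shell
# Hypothetical integrity check: checksum a source tree, copy it, and
# compare checksums of the copy. All paths here are placeholders.
set -e
SRC=/tmp/nvme_copy_src
DST=/tmp/nvme_copy_dst        # would be a directory under /flash/ in the real test
mkdir -p "$SRC" && echo "sample payload" > "$SRC/file.txt"

# Checksum every file in the source, then copy and checksum the copy.
(cd "$SRC" && find . -type f -exec sha256sum {} + | sort -k2) > /tmp/src.sums
rm -rf "$DST" && cp -a "$SRC" "$DST"
(cd "$DST" && find . -type f -exec sha256sum {} + | sort -k2) > /tmp/dst.sums

# Any mismatching line names a corrupted file.
if diff -q /tmp/src.sums /tmp/dst.sums >/dev/null; then
    echo "copy verified"
else
    echo "CORRUPTION detected"
    diff /tmp/src.sums /tmp/dst.sums
fi
```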
Sounds strange to me. Nothing forbids mounting a fs on a raid0 volume.
> Also I found that if I don't create a RAID-0 array, instead just make
> a filesystem on either /dev/nvme1n1p1 or /dev/nvme2n1p1, there is no
> data corruption.
>
> I'm wondering if there is a known issue, or I'm doing something not
> really supported.
Did you try to run the same test locally on the target side without
having nvme-tcp/nvmet-tcp target in between?
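[Editor's note: one way to run the suggested isolation test is to rebuild the same md/xfs stack on the target over loop devices, taking nvme-tcp/nvmet-tcp out of the picture entirely. A sketch under those assumptions; file names, sizes, and the md device number are placeholders:]

```shell
# Local reproduction sketch (run as root): same RAID-0 + XFS stack,
# but backed by loop devices instead of nvme-tcp namespaces.
truncate -s 2G /tmp/disk0.img /tmp/disk1.img
LOOP0=$(losetup -f --show /tmp/disk0.img)
LOOP1=$(losetup -f --show /tmp/disk1.img)

# Identical array parameters to the original report.
mdadm --create /dev/md5 --level=0 --raid-devices=2 --chunk=128 "$LOOP0" "$LOOP1"
mkfs.xfs -f /dev/md5
mkdir -p /flash && mount -o rw,noatime /dev/md5 /flash/

# ...copy the same directory into /flash/ and rerun the unzip/diff checks;
# if corruption still appears here, the fabric transport is not the cause.
```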