Data corruption when using multiple devices with NVMEoF TCP

Hao Wang pkuwangh at gmail.com
Mon Jan 11 03:56:09 EST 2021


Hey Sagi,

I exported 4 devices to the initiator, created a raid-0 array, and
copied a 98G directory with many ~100MB .gz files.
With the patch you gave applied on top of 58cf05f597b0 (fairly recent),
I saw about 24K prints in dmesg. Below are some of them:
[ 3775.256547] nvme_tcp: rq 22 (READ) data_len 131072 bio[1/2] sector a388200 bvec: nsegs 19 size 77824 offset 0
[ 3775.256768] nvme_tcp: rq 19 (READ) data_len 131072 bio[1/2] sector a388300 bvec: nsegs 19 size 77824 offset 0
[ 3775.256774] nvme_tcp: rq 20 (READ) data_len 131072 bio[1/2] sector a388400 bvec: nsegs 19 size 77824 offset 0
[ 3775.256787] nvme_tcp: rq 5 (READ) data_len 131072 bio[1/2] sector a388300 bvec: nsegs 19 size 77824 offset 0
[ 3775.256791] nvme_tcp: rq 6 (READ) data_len 131072 bio[1/2] sector a388400 bvec: nsegs 19 size 77824 offset 0
[ 3775.256794] nvme_tcp: rq 117 (READ) data_len 131072 bio[1/2] sector a388300 bvec: nsegs 19 size 77824 offset 0
[ 3775.256797] nvme_tcp: rq 118 (READ) data_len 131072 bio[1/2] sector a388400 bvec: nsegs 19 size 77824 offset 0
[ 3775.256800] nvme_tcp: rq 5 (READ) data_len 262144 bio[1/4] sector a388300 bvec: nsegs 19 size 77824 offset 0
[ 3775.257002] nvme_tcp: rq 21 (READ) data_len 131072 bio[1/2] sector a388500 bvec: nsegs 19 size 77824 offset 0
[ 3775.257006] nvme_tcp: rq 22 (READ) data_len 131072 bio[1/2] sector a388600 bvec: nsegs 19 size 77824 offset 0
[ 3775.257009] nvme_tcp: rq 7 (READ) data_len 131072 bio[1/2] sector a388500 bvec: nsegs 19 size 77824 offset 0
[ 3775.257012] nvme_tcp: rq 8 (READ) data_len 131072 bio[1/2] sector a388600 bvec: nsegs 19 size 77824 offset 0
[ 3775.257014] nvme_tcp: rq 7 (READ) data_len 131072 bio[1/2] sector a388500 bvec: nsegs 19 size 77824 offset 0
[ 3775.257017] nvme_tcp: rq 8 (READ) data_len 131072 bio[1/2] sector a388600 bvec: nsegs 19 size 77824 offset 0
[ 3775.257020] nvme_tcp: rq 6 (READ) data_len 262144 bio[1/4] sector a388500 bvec: nsegs 19 size 77824 offset 0
[ 3775.262587] nvme_tcp: rq 22 (WRITE) data_len 131072 bio[1/2] sector a388200 bvec: nsegs 19 size 77824 offset 0
[ 3775.262600] nvme_tcp: rq 22 (WRITE) data_len 131072 bio[2/2] sector a388298 bvec: nsegs 13 size 53248 offset 0
[ 3775.262610] nvme_tcp: rq 5 (WRITE) data_len 262144 bio[1/4] sector a388300 bvec: nsegs 19 size 77824 offset 0
[ 3775.262617] nvme_tcp: rq 5 (WRITE) data_len 262144 bio[2/4] sector a388398 bvec: nsegs 13 size 53248 offset 0
[ 3775.262623] nvme_tcp: rq 5 (WRITE) data_len 262144 bio[3/4] sector a388400 bvec: nsegs 19 size 77824 offset 0
[ 3775.262629] nvme_tcp: rq 5 (WRITE) data_len 262144 bio[4/4] sector a388498 bvec: nsegs 13 size 53248 offset 0
[ 3775.262635] nvme_tcp: rq 6 (WRITE) data_len 262144 bio[1/4] sector a388500 bvec: nsegs 19 size 77824 offset 0
[ 3775.262641] nvme_tcp: rq 6 (WRITE) data_len 262144 bio[2/4] sector a388598 bvec: nsegs 13 size 53248 offset 0
[ 3775.262647] nvme_tcp: rq 6 (WRITE) data_len 262144 bio[3/4] sector a388600 bvec: nsegs 19 size 77824 offset 0
[ 3775.262653] nvme_tcp: rq 6 (WRITE) data_len 262144 bio[4/4] sector a388698 bvec: nsegs 13 size 53248 offset 0
[ 3775.263009] nvme_tcp: rq 5 (WRITE) data_len 131072 bio[1/2] sector a388300 bvec: nsegs 19 size 77824 offset 0
[ 3775.263019] nvme_tcp: rq 5 (WRITE) data_len 131072 bio[2/2] sector a388398 bvec: nsegs 13 size 53248 offset 0
[ 3775.263027] nvme_tcp: rq 6 (WRITE) data_len 131072 bio[1/2] sector a388400 bvec: nsegs 19 size 77824 offset 0
[ 3775.263034] nvme_tcp: rq 6 (WRITE) data_len 131072 bio[2/2] sector a388498 bvec: nsegs 13 size 53248 offset 0
[ 3775.263040] nvme_tcp: rq 7 (WRITE) data_len 131072 bio[1/2] sector a388500 bvec: nsegs 19 size 77824 offset 0
[ 3775.263047] nvme_tcp: rq 7 (WRITE) data_len 131072 bio[2/2] sector a388598 bvec: nsegs 13 size 53248 offset 0
[ 3775.263052] nvme_tcp: rq 8 (WRITE) data_len 131072 bio[1/2] sector a388600 bvec: nsegs 19 size 77824 offset 0
[ 3775.263059] nvme_tcp: rq 8 (WRITE) data_len 131072 bio[2/2] sector a388698 bvec: nsegs 13 size 53248 offset 0
[ 3775.264341] nvme_tcp: rq 19 (WRITE) data_len 131072 bio[1/2] sector a388300 bvec: nsegs 19 size 77824 offset 0
[ 3775.264353] nvme_tcp: rq 19 (WRITE) data_len 131072 bio[2/2] sector a388398 bvec: nsegs 13 size 53248 offset 0
[ 3775.264361] nvme_tcp: rq 20 (WRITE) data_len 131072 bio[1/2] sector a388400 bvec: nsegs 19 size 77824 offset 0
[ 3775.264369] nvme_tcp: rq 20 (WRITE) data_len 131072 bio[2/2] sector a388498 bvec: nsegs 13 size 53248 offset 0
[ 3775.264380] nvme_tcp: rq 21 (WRITE) data_len 131072 bio[1/2] sector a388500 bvec: nsegs 19 size 77824 offset 0
[ 3775.264387] nvme_tcp: rq 21 (WRITE) data_len 131072 bio[2/2] sector a388598 bvec: nsegs 13 size 53248 offset 0
[ 3775.264410] nvme_tcp: rq 22 (WRITE) data_len 131072 bio[1/2] sector a388600 bvec: nsegs 19 size 77824 offset 0
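
In case it helps with sifting through the 24K prints, here is a rough Python sketch that parses one entry of the format above and checks the arithmetic within a request. The field names, the 512-byte sector unit, and the 4096-byte segment size are my assumptions read off the output (77824 / 19 = 4096, and a388200 + 77824/512 = a388298); the helper names parse and check_request are made up for illustration and are not part of the debug patch.

```python
import re

SECTOR_SIZE = 512   # assumed: sectors printed in 512-byte units
PAGE_SIZE = 4096    # assumed: each bvec segment is one full 4K page

# Matches one debug print; whitespace (including a line wrap inside an
# entry) is normalized before matching.
LINE_RE = re.compile(
    r"nvme_tcp: rq (?P<rq>\d+) \((?P<op>READ|WRITE)\) "
    r"data_len (?P<data_len>\d+) bio\[(?P<idx>\d+)/(?P<total>\d+)\] "
    r"sector (?P<sector>[0-9a-f]+) bvec: nsegs (?P<nsegs>\d+) "
    r"size (?P<size>\d+) offset (?P<offset>\d+)"
)

INT_FIELDS = ("rq", "data_len", "idx", "total", "nsegs", "size", "offset")


def parse(entry):
    """Parse one print into a dict, or return None if it does not match."""
    m = LINE_RE.search(" ".join(entry.split()))
    if m is None:
        return None
    fields = {k: int(m.group(k)) for k in INT_FIELDS}
    fields["op"] = m.group("op")
    fields["sector"] = int(m.group("sector"), 16)  # sector is printed in hex
    return fields


def check_request(entries):
    """Check prints of one request, ordered bio[1/N], bio[2/N], ...

    Returns a list of human-readable problems (empty if all checks pass).
    """
    problems = []
    for e in entries:
        # Holds only under the full-page, offset-0 assumption above.
        if e["size"] != e["nsegs"] * PAGE_SIZE:
            problems.append(f"bio {e['idx']}: size != nsegs * {PAGE_SIZE}")
    for prev, cur in zip(entries, entries[1:]):
        expected = prev["sector"] + prev["size"] // SECTOR_SIZE
        if cur["sector"] != expected:
            problems.append(f"bio {cur['idx']}: sector gap after bio {prev['idx']}")
    # Only compare against data_len when every bio of the request was printed.
    if entries and len(entries) == entries[0]["total"]:
        if sum(e["size"] for e in entries) != entries[0]["data_len"]:
            problems.append("sum of bio sizes != data_len")
    return problems
```

Feeding it the two rq 22 (WRITE) prints at 3775.262587 and 3775.262600 should report no problems, since 77824 + 53248 = 131072. The READ prints above only show bio[1/N], so for those the data_len check is skipped.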

Hao

On Tue, Jan 5, 2021 at 5:53 PM Sagi Grimberg <sagi at grimberg.me> wrote:
>
> Hey Hao,
>
> > Okay, will do that in a few days. Something else just popped up and I
> > have a limited time window to use some machines.
> >
> > BTW, what is the performance implication of disabling merge? My usage
> > pattern is mostly sequential read and write, and write bandwidth is
> > pretty high.
>
> Did you get a chance to look into this?


