Data corruption when using multiple devices with NVMEoF TCP
Sagi Grimberg
sagi at grimberg.me
Thu Dec 24 12:56:17 EST 2020
> Sagi, thanks a lot for helping look into this.
>
>> Question, if you build the raid0 in the target and expose that over nvmet-tcp (with a single namespace), does the issue happen?
> No, it works fine in that case.
> Actually with this setup, initially the latency was pretty bad, and it
> seems enabling CONFIG_NVME_MULTIPATH improved it significantly.
> I'm not exactly sure though as I've changed too many things and didn't
> specifically test for this setup.
> Could you help confirm that?
>
> And after applying your patch,
> - With the problematic setup, i.e. creating a 2-device raid0, I did
> see numerous prints popping up in dmesg; a few lines are
> pasted below:
> - With the good setup, i.e. only using 1 device, this line also pops
> up, but much less frequently.
Hao, question, what is the io-scheduler in-use for the nvme-tcp devices?
Can you try to reproduce this issue when disabling merges on the
nvme-tcp devices?
echo 2 > /sys/block/nvmeXnY/queue/nomerges
I want to see if this is an issue with merged bios.
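For convenience, here is a rough sketch of how the merge-disable step could be applied across all nvme block devices and verified; the glob pattern and device names are assumptions, so adjust them to match the actual nvme-tcp namespaces on the host:

```shell
#!/bin/sh
# Sketch: disable request merging on each nvme namespace and confirm it.
# The nvme*n* glob is an assumption; narrow it to the nvme-tcp devices only.
for q in /sys/block/nvme*n*/queue; do
    # nomerges=2 disables both front- and back-merging for the queue
    echo 2 > "$q/nomerges"
    echo "$q: nomerges=$(cat "$q/nomerges")"
done
```

Note that nomerges is not persistent across reboots, so re-run this after reconnecting the nvme-tcp controllers before retrying the reproducer.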
More information about the Linux-nvme
mailing list