Error when running fio against nvme-of rdma target (mlx5 driver)
Martin Oliveira
Martin.Oliveira at eideticom.com
Thu Feb 10 15:58:52 PST 2022
On 2/9/22 1:41 AM, Chaitanya Kulkarni wrote:
> On 2/8/22 6:50 PM, Martin Oliveira wrote:
> > Hello,
> >
> > We have been hitting an error when running IO over our nvme-of setup, using the mlx5 driver and we are wondering if anyone has seen anything similar/has any suggestions.
> >
> > Both initiator and target are AMD EPYC 7502 machines connected over RDMA using a Mellanox MT28908. Target has 12 NVMe SSDs which are exposed as a single NVMe fabrics device, one physical SSD per namespace.
> >
>
> Thanks for reporting this, if you can bisect the problem on your setup
> it will help others to help you better.
>
> -ck
Hi Chaitanya,
I went back to a kernel as old as 4.15 and the problem was still there, so I don't have a known-good commit to start a bisect from.
I also learned that I can reproduce this with as few as 3 cards, and I updated the firmware on the Mellanox cards to the latest version.
I'd be happy to try any tests if someone has any suggestions.
Thanks,
Martin