Error when running fio against nvme-of rdma target (mlx5 driver)

Martin Oliveira Martin.Oliveira at eideticom.com
Thu Feb 10 15:58:52 PST 2022


On 2/9/22 1:41 AM, Chaitanya Kulkarni wrote:
> On 2/8/22 6:50 PM, Martin Oliveira wrote:
> > Hello,
> >
> > We have been hitting an error when running IO over our nvme-of setup, using the mlx5 driver and we are wondering if anyone has seen anything similar/has any suggestions.
> >
> > Both initiator and target are AMD EPYC 7502 machines connected over RDMA using a Mellanox MT28908. Target has 12 NVMe SSDs which are exposed as a single NVMe fabrics device, one physical SSD per namespace.
> >
> 
> Thanks for reporting this, if you can bisect the problem on your setup
> it will help others to help you better.
> 
> -ck

Hi Chaitanya,

I went back to a kernel as old as 4.15 and the problem was still there, so I don't know of a good commit to start from.

I also learned that I can reproduce this with as little as 3 cards and I updated the firmware on the Mellanox cards to the latest version.

I'd be happy to try any tests if someone has any suggestions.

Thanks,
Martin


More information about the Linux-nvme mailing list