nvmeof Issues with Zen 3/Ryzen 5000 Initiator

Jonathan Wright jonathan at knownhost.com
Mon Jun 21 09:06:26 PDT 2021


Adding Nikolova, Tatyana E and Ismail, Mustafa to CC per auto-responder 
from @Shiraz.

On 6/21/21 11:04 AM, Jonathan Wright wrote:
>
> On 6/3/21 11:57 AM, Jonathan Wright wrote:
>>
>>>> I've been testing NVMe over Fabrics for the past few weeks and the 
>>>> performance has been nothing short of incredible, though I'm 
>>>> running into some major issues that seem to be specifically 
>>>> related to AMD Zen 3 Ryzen chips (in my case I'm testing with 5900x).
>>>>
>>>> Target:
>>>> Supermicro X10 board
>>>> Xeon E5-2620v4
>>>> Intel E810 NIC
>>>>
>>>> Problematic Client/initiator:
>>>> ASRock X570 board
>>>> Ryzen 9 5900x
>>>> Intel E810 NIC
>>>>
>>>> Stable Client/initiator:
>>>> Supermicro X10 board
>>>> Xeon E5-2620v4
>>>> Intel E810 NIC
>>>>
>>>> I'm using the same 2 E810 NICs and pair of 25G DACs in both cases.  
>>>> The NICs are directly connected with the DACs and there is no 
>>>> switch in the equation.  To trigger the issue I'm simply using FIO 
>>>> similar to this:
>>>>
>>>> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 
>>>> --name=test --filename=/dev/nvme0n1 --bs=4k --iodepth=64 --size=10G 
>>>> --readwrite=randread --time_based --runtime=1200
>>>>
>>>> I'm primarily using RDMA/iWARP right now but I've also tested RoCEv2 
>>>> which presents the same issues/symptoms. Primary testing has been 
>>>> done with Ubuntu 20.04.2 with CentOS 8 in the mix as well just to 
>>>> try and rule out a weird distro-specific issue. All tests used the 
>>>> latest ice/irdma drivers from Intel (1.5.8 and 1.5.2 respectively).
>>>
>>> CCing Shiraz Saleem who maintains irdma.
>>
>> Thanks.  I've done some testing now with Mellanox ConnectX-4 cards 
>> with Zen 3 and the issue does not exist.  This seems to point the 
>> finger at something specific between irdma and Zen 2/3, since 
>> irdma/E810 works fine on all-Intel hardware.  I tested the Mellanox 
>> cards on both the Ubuntu 20.04 stock kernel (5.4) and the CentOS 8.3 
>> stock kernel (4.18), the same combinations I tested the E810 on.
>>
>> Further, I tested an AMD target with an Intel initiator and the issue 
>> still exists, so it doesn't seem to matter which end the Zen 3 (and/or 
>> Zen 2) chip is on when paired with an E810/irdma.
>>
>> The issue also exists with Zen 2 (Ryzen 3600).
>>
>> @Shiraz, since I guess this isn't a common setup right now, let me 
>> know if I can be of any assistance with getting to the bottom of this 
>> seeming incompatibility.
>
> @Intel, is there any desire to fix this currently?  I'll be sending 
> this current lab equipment with the Zen 3 systems to other tasks soon 
> and will no longer be able to help troubleshoot or test. Without a fix 
> we'll have to look into other vendors for fabric setups.
>
-- 
Jonathan Wright
KnownHost, LLC
https://www.knownhost.com



