Performance Regression due to ASPM disable patch

Anuj Gupta anuj20.g at samsung.com
Wed Jul 12 08:55:21 PDT 2023


Hi,

I see a performance regression for read/write workloads on our NVMe over
Fabrics setup, which uses TCP as the transport.
IOPS drop by 23% for 4k-randread [1] and by 18% for 4k-randwrite [2].

I bisected and found that the commit
e1ed3e4d91112027b90c7ee61479141b3f948e6a ("r8169: disable ASPM during
NAPI poll") is the trigger.
When I revert this commit, the performance drop goes away.

The target machine uses a Realtek Ethernet controller:
root@testpc:/home/test# lspci | grep -i eth
29:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Device 2600 (rev 21)
2a:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Killer E3000 2.5GbE Controller (rev 03)

I tried disabling ASPM by passing "pcie_aspm=off" as a boot parameter and
by setting the PCIe ASPM policy to performance, but neither improved
performance.
I wonder if this is already known, and whether something different should
be done to handle the original issue?
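
For anyone reproducing this, the ASPM state can be double-checked from
userspace. The following is only a minimal sketch and not part of the
steps reported above: the helper name active_aspm_policy is hypothetical,
and it assumes the kernel exposes /sys/module/pcie_aspm/parameters/policy
with the active policy shown in square brackets.

```shell
# Print the active ASPM policy, i.e. the entry shown in [brackets].
# $1: contents of the policy file, for example
#     "default performance [powersave] powersupersave"
active_aspm_policy() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Assumed sysfs location of the global ASPM policy.
policy_file=/sys/module/pcie_aspm/parameters/policy
if [ -r "$policy_file" ]; then
    active_aspm_policy "$(cat "$policy_file")"
fi

# Per-device link state can be inspected as root, for example for one of
# the NICs from the lspci listing above:
#   lspci -vv -s 2a:00.0 | grep -i aspm
```

If the policy file is missing or unreadable, the sketch prints nothing.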

[1] fio randread
fio -direct=1 -iodepth=1 -rw=randread -ioengine=psync -bs=4k -numjobs=1 \
    -runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read \
    -output=psync_read
[2] fio randwrite
fio -direct=1 -iodepth=1 -rw=randwrite -ioengine=psync -bs=4k -numjobs=1 \
    -runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read \
    -output=psync_write
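
The percentages quoted above are relative IOPS drops between the kernel
with the commit reverted and the kernel with it applied. A minimal sketch
of that arithmetic follows; the helper name pct_drop is hypothetical, and
the IOPS values in the example call are placeholders, not measured
results.

```shell
# pct_drop BASELINE REGRESSED
# Prints the relative drop in percent with one decimal place.
pct_drop() {
    awk -v base="$1" -v cur="$2" \
        'BEGIN { printf "%.1f\n", (base - cur) * 100 / base }'
}

# Placeholder values only: a baseline of 10000 IOPS falling to 7700 IOPS.
pct_drop 10000 7700   # prints 23.0
```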

