Read speed for a PCIe NVMe SSD is ridiculously slow on a multi-socket machine.

Alexander Shumakovitch shurik at jhu.edu
Fri Mar 24 17:33:14 PDT 2023


Hi Damien,

Just to add to my previous message, I've run the same set of tests on a
small SATA SSD boot drive (Kingston A400) attached to the same system, and
it turned out to be more or less node- and I/O-mode-agnostic, producing
consistent read speeds of about 450MB/sec in direct I/O mode and about
480MB/sec in cached I/O mode. In particular, cached reads on the "wrong"
NUMA node were significantly faster for this SATA SSD than for the NVMe
drive, which drops to about 170MB/sec there (both drives are connected to
CPU #0).
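
For concreteness, the comparison boils down to runs like the following
(a sketch, assuming hdparm -t for the timed reads and numactl for node
pinning; the device path and node numbers are placeholders for my setup):

    # direct I/O (O_DIRECT), local vs. remote NUMA node
    numactl --cpunodebind=0 --membind=0 hdparm -t --direct /dev/nvme0n1
    numactl --cpunodebind=1 --membind=1 hdparm -t --direct /dev/nvme0n1

    # cached reads through the page cache, local vs. remote NUMA node
    numactl --cpunodebind=0 --membind=0 hdparm -t /dev/nvme0n1
    numactl --cpunodebind=1 --membind=1 hdparm -t /dev/nvme0n1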

So my question becomes: why are cached reads through the NVMe driver
susceptible to this (very) large NUMA slowdown, while the AHCI ones are
not? Are there some fundamental differences in how AHCI and NVMe block
devices interact with the page cache?

Thank you,

  --- Alex.

On Fri, Mar 24, 2023 at 05:43:42PM +0900, Damien Le Moal wrote:
> It is very unusual to use hdparm, a tool designed mainly for ATA devices, to
> benchmark an NVMe device. At the very least, if you really want to measure the
> drive performance, you should add the --direct option (see man hdparm).
> 
> But a better way to test would be to use fio with io_uring or libaio IO engine
> doing multi-job & high QD --direct=1 IOs. That will give you the maximum
> performance of your device. Then remove the --direct=1 option to do buffered
> IOs, which will expose potential issues with your system memory bandwidth.
> 
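
For anyone reproducing this, the fio run Damien describes might look
roughly like the one below (a sketch, not his exact command; the device
path, block size, queue depth, and job count are illustrative
assumptions):

    fio --name=seqread --filename=/dev/nvme0n1 --ioengine=io_uring \
        --rw=read --bs=128k --iodepth=32 --numjobs=4 --direct=1 \
        --runtime=30 --time_based --group_reporting

Dropping --direct=1 from that command switches it to buffered reads
through the page cache, which is the mode where the slowdown shows up.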

