Read speed for a PCIe NVMe SSD is ridiculously slow on a multi-socket machine.

Alexander Shumakovitch shurik at jhu.edu
Fri Mar 24 14:38:38 PDT 2023


Thank you, Keith. As I have just written to Damien, I've started testing
this hardware from a live USB stick distro, which included 'taskset', but
not 'numactl'. But given the large amount of RAM on the server in question,
the kernel should have taken care of the memory pinning anyway.

In any case, it looks like the main issue is indeed with access to the read
cache, so now I have to figure out what to do about it.

Thanks,

  --- Alex.

On Fri, Mar 24, 2023 at 01:34:51PM -0600, Keith Busch wrote:
> When writing host->dev, there is no cache coherency to consider so it'll
> always be faster in NUMA situations. Reading dev->host does, and can have
> considerable overhead, though 10x seems a bit high.
> 
> Retrying with Damien's O_DIRECT suggestion is a good idea.
> 
> Also, 'taskset' only pins the CPUs the process schedules on, but not the
> memory node it allocates from. Try 'numactl' instead for local node
> allocations.
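
To illustrate the two suggestions quoted above (O_DIRECT reads and 'numactl'
instead of 'taskset'), here is a minimal sketch that only builds the command
line for such a test. The function name, the device path /dev/nvme0n1, the
node number, and the block size and count are all placeholders, not values
from this thread. It assumes the stock 'numactl' and GNU 'dd' tools are
installed.

```python
import shlex


def numa_local_read_cmd(device, node, block_size="1M", count=4096):
    """Build a numactl + dd command that reads `device` with O_DIRECT while
    keeping both CPU scheduling and memory allocation on NUMA node `node`.

    All arguments are illustrative; pick the node the SSD is attached to.
    """
    return [
        "numactl",
        f"--cpunodebind={node}",  # run only on CPUs of this node (the part taskset covers)
        f"--membind={node}",      # allocate only from this node's memory (taskset does not)
        "dd",
        f"if={device}",
        "of=/dev/null",
        f"bs={block_size}",
        f"count={count}",
        "iflag=direct",           # open the input with O_DIRECT, bypassing the page cache
    ]


if __name__ == "__main__":
    # Placeholder device and node; print the command instead of running it.
    print(shlex.join(numa_local_read_cmd("/dev/nvme0n1", 0)))
```

Running the printed command once per NUMA node and comparing the throughput
that 'dd' reports would show whether the slowdown follows the node, though
that is only one way to set up the comparison.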

