Enabling poll_queues on NVME with kernel-5.x
Varad Gautam
varadgautam at gmail.com
Thu Jul 23 04:59:21 EDT 2020
Hi,
Since commit a4668d9ba ("nvme: default to 0 poll queues") [1], the
nvme driver needs to be explicitly configured with poll_queues > 0 to
allow enabling io_poll.
However, prior to the poll queue separation in 4b04cc6a8 ("nvme: add
separate poll queue map") [2], io_poll was enabled by default on nvme
block devices.
As a result, nvme drives now see higher I/O latencies with the default
settings (nvme.poll_queues=0, io_poll=0), as the fio slat/clat/lat
numbers below show.
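For anyone who wants to check which of the two settings mentioned above is
in effect on a given machine, here is a minimal Python sketch. It assumes
the usual sysfs locations (/sys/module/nvme/parameters/poll_queues and
/sys/block/<dev>/queue/io_poll) and the device name nvme0n1 from the disk
stats in this mail. The helper name read_poll_settings and its sysfs_root
argument are made up for illustration and are not part of any kernel or
fio tooling.

```python
from pathlib import Path


def read_poll_settings(device="nvme0n1", sysfs_root="/sys"):
    """Return (poll_queues, io_poll) as ints; None where a file is unreadable.

    Assumption: the nvme module exposes its poll_queues parameter under
    <sysfs_root>/module/nvme/parameters and the block device exposes
    io_poll under <sysfs_root>/block/<device>/queue.
    """
    root = Path(sysfs_root)
    paths = (
        root / "module" / "nvme" / "parameters" / "poll_queues",
        root / "block" / device / "queue" / "io_poll",
    )
    values = []
    for path in paths:
        try:
            values.append(int(path.read_text().strip()))
        except (OSError, ValueError):
            # Missing file, no permission, or unexpected content.
            values.append(None)
    return tuple(values)


if __name__ == "__main__":
    poll_queues, io_poll = read_poll_settings()
    print(f"nvme.poll_queues={poll_queues} io_poll={io_poll}")
```

The sysfs_root argument exists only so the helper can be pointed at a
scratch directory for testing; on a real system the default applies.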
The commit [1] says:
> We need a better way of configuring this, and given that polling is
> (still) a bit niche, let's default to using 0 poll queues.
Are there any plans, or is any work needed, for nvme to provide
poll_queues > 0 by default?
kernel-5.4, io_poll=0
---------------------
bash-4.1$ sudo fio /tmp/fio-workload
fio-workload: (g=0): rw=randrw, bs=16K-16K/16K-16K, ioengine=libaio, iodepth=1
fio 1.55
Starting 1 process
Jobs: 1 (f=3), CR=2000/0 IOPS: [m] [97.3% done] [17006K/17219K /s]
[1038 /1051 iops] [eta 00m:08s]
fio-workload: (groupid=0, jobs=1): err= 0: pid=51062
read : io=4564.1MB, bw=16013KB/s, iops=1000 , runt=291918msec
slat (usec): min=3 , max=75 , avg= 5.61, stdev= 2.33
clat (usec): min=59 , max=2584 , avg=173.86, stdev=307.87
lat (usec): min=78 , max=2590 , avg=180.07, stdev=307.88
bw (KB/s) : min=13140, max=19040, per=100.21%, avg=16045.14, stdev=1015.87
write: io=4565.4MB, bw=16014KB/s, iops=1000 , runt=291918msec
slat (usec): min=4 , max=86 , avg= 6.33, stdev= 2.85
clat (usec): min=3 , max=420 , avg=39.54, stdev= 5.31
lat (usec): min=40 , max=427 , avg=46.49, stdev= 6.53
bw (KB/s) : min=13090, max=19104, per=100.22%, avg=16049.96, stdev=1055.02
cpu : usr=0.76%, sys=2.24%, ctx=587979, majf=0, minf=183
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w/d: total=292153/292180/0, short=0/0/0
lat (usec): 4=0.01%, 10=0.01%, 20=0.01%, 50=48.66%, 100=11.94%
lat (usec): 250=35.31%, 500=2.28%, 750=0.24%, 1000=0.18%
lat (msec): 2=0.72%, 4=0.67%
Run status group 0 (all jobs):
READ: io=4564.1MB, aggrb=16012KB/s, minb=16397KB/s, maxb=16397KB/s,
mint=291918msec, maxt=291918msec
WRITE: io=4565.4MB, aggrb=16014KB/s, minb=16398KB/s, maxb=16398KB/s,
mint=291918msec, maxt=291918msec
Disk stats (read/write):
nvme0n1: ios=292073/292298, merge=0/135, ticks=50570/11593,
in_queue=0, util=35.54%
kernel-5.4, nvme.poll_queues=32, io_poll=1
------------------------------------------
bash-4.1$ sudo fio /tmp/fio-workload
fio-workload: (g=0): rw=randrw, bs=16K-16K/16K-16K, ioengine=libaio, iodepth=1
fio 1.55
Starting 1 process
Jobs: 1 (f=3), CR=2000/0 IOPS: [m] [97.3% done] [16809K/16678K /s]
[1026 /1018 iops] [eta 00m:08s]
fio-workload: (groupid=0, jobs=1): err= 0: pid=11017
read : io=4565.5MB, bw=16009KB/s, iops=1000 , runt=291994msec
slat (usec): min=3 , max=81 , avg= 5.41, stdev= 2.32
clat (usec): min=60 , max=2593 , avg=165.01, stdev=309.91
lat (usec): min=76 , max=2598 , avg=171.04, stdev=309.94
bw (KB/s) : min=13076, max=19008, per=100.20%, avg=16041.43, stdev=926.95
write: io=4565.2MB, bw=16010KB/s, iops=1000 , runt=291994msec
slat (usec): min=3 , max=84 , avg= 6.04, stdev= 2.78
clat (usec): min=2 , max=280 , avg=36.30, stdev= 4.14
lat (usec): min=37 , max=286 , avg=42.96, stdev= 5.19
bw (KB/s) : min=13085, max=19168, per=100.21%, avg=16042.78, stdev=978.02
cpu : usr=0.68%, sys=2.28%, ctx=587989, majf=0, minf=186
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w/d: total=292163/292170/0, short=0/0/0
lat (usec): 4=0.01%, 10=0.01%, 20=0.01%, 50=49.39%, 100=21.80%
lat (usec): 250=25.09%, 500=2.11%, 750=0.13%, 1000=0.12%
lat (msec): 2=0.65%, 4=0.70%
Run status group 0 (all jobs):
READ: io=4565.5MB, aggrb=16009KB/s, minb=16393KB/s, maxb=16393KB/s,
mint=291994msec, maxt=291994msec
WRITE: io=4565.2MB, aggrb=16009KB/s, minb=16393KB/s, maxb=16393KB/s,
mint=291994msec, maxt=291994msec
Disk stats (read/write):
nvme0n1: ios=292051/292222, merge=0/143, ticks=47967/10589,
in_queue=0, util=34.31%
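To make the two runs above easier to compare, the following Python sketch
computes the relative change in the average slat/clat/lat values. The
numbers are copied from the fio output in this mail; the names BASELINE,
POLLED, percent_reduction and compare are made up for illustration. Each
configuration was run once, and the read clat stdev (about 308 usec) is
much larger than the difference in averages, so the percentages are only
a rough indication.

```python
# Average latencies in usec, copied from the two fio runs in this mail.
BASELINE = {  # kernel-5.4, io_poll=0
    ("read", "slat"): 5.61,
    ("read", "clat"): 173.86,
    ("read", "lat"): 180.07,
    ("write", "slat"): 6.33,
    ("write", "clat"): 39.54,
    ("write", "lat"): 46.49,
}
POLLED = {  # kernel-5.4, nvme.poll_queues=32, io_poll=1
    ("read", "slat"): 5.41,
    ("read", "clat"): 165.01,
    ("read", "lat"): 171.04,
    ("write", "slat"): 6.04,
    ("write", "clat"): 36.30,
    ("write", "lat"): 42.96,
}


def percent_reduction(before, after):
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100.0


def compare(baseline, polled):
    """Map each (op, metric) key to its percent reduction."""
    return {key: percent_reduction(baseline[key], polled[key]) for key in baseline}


if __name__ == "__main__":
    for (op, metric), pct in compare(BASELINE, POLLED).items():
        before = BASELINE[(op, metric)]
        after = POLLED[(op, metric)]
        print(f"{op:<5} {metric:<4} {before:7.2f} -> {after:7.2f} usec ({pct:.1f}% lower)")
```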
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=a4668d9ba
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=4b04cc6a8
Thanks,
Varad Gautam