[PATCH RFC 0/1] Add visibility for native NVMe multipath using debugfs
Nilay Shroff
nilay at linux.ibm.com
Wed Jul 24 23:20:33 PDT 2024
On 7/24/24 20:07, Keith Busch wrote:
> On Mon, Jul 22, 2024 at 03:01:08PM +0530, Nilay Shroff wrote:
>> # cat /sys/kernel/debug/block/nvme1n1/multipath
>> io-policy: queue-depth
>> io-path:
>> --------
>> node  path       ctrl   qdepth  ana-state
>> 2     nvme1c1n1  nvme1  1328    optimized
>> 2     nvme1c3n1  nvme3  1324    optimized
>> 3     nvme1c1n1  nvme1  1328    optimized
>> 3     nvme1c3n1  nvme3  1324    optimized
>>
>> The above output was captured while I/O was running and accessing
>> namespace nvme1n1. From the above output, we see that iopolicy is set to
>> "queue-depth". When we have I/O workload running on numa node 2, accessing
>> namespace "nvme1n1", the I/O path nvme1c1n1/nvme1 has queue depth of 1328
>> and another I/O path nvme1c3n1/nvme3 has queue depth of 1324. Both paths
>> are optimized and seems that both paths are equally utilized for
>> forwarding I/O.
>
> You can get the outstanding queue-depth from iostats too, and that
> doesn't rely on the queue-depth io policy. It does, however, require
> that stats are enabled, but that's probably a more reasonable given
> than an io policy.
>
Yes, correct, a user could use iostat to find the queue depth in real
time while an I/O workload is running.
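For example, the in-flight counts that iostat reports can also be read
directly from sysfs. Below is a minimal userspace sketch (not part of
this patch; the head disk name nvme1n1 is taken from the example output
above, and /sys/block/<dev>/inflight holds the in-flight read and write
counts):

#include <stdio.h>

int main(void)
{
	/* Head disk name assumed from the example output above. */
	const char *path = "/sys/block/nvme1n1/inflight";
	unsigned long reads, writes;
	FILE *f = fopen(path, "r");

	if (!f) {
		perror(path);
		return 1;
	}
	/* The file holds two counters: in-flight reads and writes. */
	if (fscanf(f, "%lu %lu", &reads, &writes) == 2)
		printf("in-flight: %lu reads, %lu writes\n", reads, writes);
	fclose(f);
	return 0;
}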
>> The same could be said for a workload running on numa
>> node 3.
>
> The output for all numa nodes will be the same regardless of which node
> a workload is running on (the accounting isn't per-node), so I'm not
> sure outputting qdepth again for each node is useful.
Agreed, so in that case we may show only the available I/O paths for
the head disk node when the I/O policy is set to "queue-depth". In that
case, we don't need to show paths per numa node, as you suggested, and
then for each I/O path we can show the "qdepth" once.
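To make the idea concrete, here's a rough sketch of what such a debugfs
show routine could look like. This is not the posted patch; it assumes
the context of drivers/nvme/host/multipath.c (nvme_iopolicy_names,
nvme_ana_state_names) and the nr_active counter added by the
queue-depth iopolicy work:

/* Sketch only: list each sibling path of the head disk once, with its
 * controller name, current queue depth and ANA state.
 */
static int nvme_multipath_show(struct seq_file *m, void *unused)
{
	struct nvme_ns_head *head = m->private;
	struct nvme_ns *ns;
	int srcu_idx;

	seq_printf(m, "io-policy: %s\n",
		   nvme_iopolicy_names[READ_ONCE(head->subsys->iopolicy)]);
	seq_puts(m, "path       ctrl   qdepth  ana-state\n");

	srcu_idx = srcu_read_lock(&head->srcu);
	list_for_each_entry_rcu(ns, &head->list, siblings)
		seq_printf(m, "%-10s %-6s %-7d %s\n",
			   ns->disk->disk_name, dev_name(ns->ctrl->device),
			   atomic_read(&ns->ctrl->nr_active),
			   nvme_ana_state_names[ns->ana_state]);
	srcu_read_unlock(&head->srcu, srcu_idx);

	return 0;
}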
IMO, though it's possible to find the queue depth by monitoring the
iostat output, it'd be convenient to have it readily available in one
place where we would add further visibility into multipathing.
Thanks,
--Nilay