[PATCH rfc] nvme: support io stats on the mpath device
Christoph Hellwig
hch at lst.de
Sun Oct 30 09:22:53 PDT 2022
On Tue, Oct 25, 2022 at 06:58:19PM +0300, Sagi Grimberg wrote:
>> and even more so the special start/end calls in all
>> the transport drivers.
>
> The end is centralized, and the start part is not sprinkled across
> the drivers. I don't think it's bad.
Well. We need a new magic helper instead of blk_mq_start_request,
and a new call to nvme_mpath_end_request in the lower driver to
support functionality in the multipath driver that sits above them.
This is because of the hack of storing the start_time in the
nvme_request, which is really owned by the lower driver, and is
quite a layering violation.
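To make that concrete, here is a rough sketch of the shape under
discussion.  The helper names and the bdev_start_io_acct /
bdev_end_io_acct calls are my approximation of the RFC, not quoted
from it:

static inline void nvme_start_request(struct request *rq)
{
	if (rq->q->queuedata)	/* I/O queue, not the admin queue */
		nvme_mpath_start_request(rq);
	blk_mq_start_request(rq);
}

void nvme_mpath_start_request(struct request *rq)
{
	struct nvme_ns *ns = rq->q->queuedata;
	struct gendisk *disk = ns->head->disk;

	if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq))
		return;

	/*
	 * Here is the layering violation: per-I/O state of the upper
	 * (mpath) device is stashed in the lower driver's nvme_request.
	 */
	nvme_req(rq)->flags |= NVME_MPATH_IO_STATS;
	nvme_req(rq)->start_time = bdev_start_io_acct(disk->part0,
			blk_rq_bytes(rq) >> SECTOR_SHIFT,
			req_op(rq), jiffies);
}

void nvme_mpath_end_request(struct request *rq)
{
	struct nvme_ns *ns = rq->q->queuedata;

	if (!(nvme_req(rq)->flags & NVME_MPATH_IO_STATS))
		return;
	bdev_end_io_acct(ns->head->disk->part0, req_op(rq),
			nvme_req(rq)->start_time);
}

Every transport driver then has to call nvme_start_request instead of
blk_mq_start_request, and the completion path has to remember to call
nvme_mpath_end_request, which is exactly the sprinkling I mean.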
If the multipath driver simply did the start and end itself, things
would be a lot better. The upside of that would be that it also
accounts for the (tiny) overhead of the mpath driver. The big
downside would be that we'd have to allocate memory just for the
start_time, as nvme-multipath has no per-I/O data structure of its
own. In a way it would be nice to just have a start_time in
the bio, which would clean up the interface a lot, and
also massively simplify the I/O accounting in md. But Jens might
not be willing to grow the bio for this special case, even if some
things in the bio seem even more obscure.
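Purely as illustration, with an invented bi_start_time field that
struct bio does not have today:

static void nvme_ns_head_submit_bio(struct bio *bio)
{
	/* charge the I/O to the mpath node's own statistics */
	bio->bi_start_time = bio_start_io_acct(bio);

	/* ... select a path and resubmit the bio as we do today ... */
}

/* and on the completion side, before ending the parent bio: */
bio_end_io_acct(bio, bio->bi_start_time);

md currently has to clone the bio into a small per-I/O structure
largely just to carry that start time around, which is the
simplification I mean.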
>> the stats sysfs attributes already have entirely separate
>> blk-mq vs bio-based code paths. So I think having a block_device
>> operation that replaces part_stat_read_all and allows nvme to
>> iterate over all paths and collect the numbers would seem
>> a lot nicer. There might be some caveats, like having to stash
>> away the numbers for disappearing paths, though.
>
> You think this is better? Really? I don't agree with you; I think it's
> better to pay a small cost than to do this very specialized thing that
> will only ever be used for nvme-mpath.
Yes, I think a callout at least conceptually would be much better.
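Conceptually something like this, where the operation name and
signature are made up here, and part_stat_read_all would have to be
exported from block/genhd.c:

	/* hypothetical new method in struct block_device_operations */
	void (*get_io_stat)(struct gendisk *disk, struct disk_stats *stat);

static void nvme_ns_head_get_io_stat(struct gendisk *disk,
		struct disk_stats *stat)
{
	struct nvme_ns_head *head = disk->private_data;
	struct nvme_ns *ns;
	struct disk_stats path;
	int srcu_idx, i;

	memset(stat, 0, sizeof(*stat));
	srcu_idx = srcu_read_lock(&head->srcu);
	list_for_each_entry_rcu(ns, &head->list, siblings) {
		/* fold each path's numbers into the mpath totals */
		part_stat_read_all(ns->disk->part0, &path);
		for (i = 0; i < NR_STAT_GROUPS; i++) {
			stat->ios[i] += path.ios[i];
			stat->merges[i] += path.merges[i];
			stat->sectors[i] += path.sectors[i];
			stat->nsecs[i] += path.nsecs[i];
		}
		stat->io_ticks += path.io_ticks;
	}
	srcu_read_unlock(&head->srcu, srcu_idx);
}

The disappearing-path caveat from above still applies: the head would
have to fold the final numbers of a removed path into a stashed total
before dropping it.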