[PATCH v7 0/3] FDP and per-io hints
Keith Busch
kbusch at kernel.org
Tue Oct 15 08:09:20 PDT 2024
On Tue, Oct 15, 2024 at 07:50:06AM +0200, Christoph Hellwig wrote:
> 1) While the current per-file temperature hints interface is not perfect
> it is okay and makes sense to reuse until we need something more fancy.
> We make good use of it in f2fs and the upcoming zoned xfs code to help
> with data placement and have numbers to show that it helps.
So we're okay to proceed with patch 1?
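
For context, the per-file temperature hint interface discussed in point 1 is the fcntl() write lifetime hint. A minimal userspace sketch follows, assuming a Linux kernel and libc that support F_SET_RW_HINT; the two helper names are made up for illustration and are not part of this series.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes F_SET_RW_HINT and RWH_* in glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>

/* Fallback values from the Linux UAPI headers, for older libc headers. */
#ifndef F_GET_RW_HINT
#define F_GET_RW_HINT 1035
#endif
#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT 1036
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT 2
#endif

/* Hypothetical helper: set the write lifetime hint on the file behind fd.
 * Returns 0 on success or a negative errno value. */
int set_write_lifetime_hint(int fd, uint64_t hint)
{
	if (fcntl(fd, F_SET_RW_HINT, &hint) < 0)
		return -errno;
	return 0;
}

/* Hypothetical helper: read the hint back into *hint.
 * Returns 0 on success or a negative errno value. */
int get_write_lifetime_hint(int fd, uint64_t *hint)
{
	if (fcntl(fd, F_GET_RW_HINT, hint) < 0)
		return -errno;
	return 0;
}
```

A caller would open a file, call set_write_lifetime_hint(fd, RWH_WRITE_LIFE_SHORT) once, and then write as usual. The hint applies to the file as a whole, not to individual I/Os.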
> 2) A per-I/O interface to set these temperature hints conflicts badly
> with how placement works in file systems. If we have an urgent need
> for it on the block device it needs to be opt-in by the file operations
> so it can be enabled on block device, but not on file systems by
> default. This way you can implement it for block device, but not
> provide it on file systems by default. If a given file system finds
> a way to implement it, it can still opt into implementing it, of course.
If we add a new fop_flag that only block fops enable, then it's okay?
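
To make the opt-in idea concrete, here is a small userspace model of it. This is only a sketch: the flag name FOP_PER_IO_HINTS, the struct, and the -EINVAL return are all assumptions made up for illustration, not code from this series or from the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag name, invented for this sketch. */
#define FOP_PER_IO_HINTS (1u << 0)

/* Minimal stand-in for the kernel's struct file_operations. */
struct fops_model {
	uint32_t fop_flags;
};

/* Block device fops opt in; file system fops do not by default. */
const struct fops_model blkdev_fops_model = { .fop_flags = FOP_PER_IO_HINTS };
const struct fops_model fs_fops_model = { .fop_flags = 0 };

/* Reject a per-io hint unless the file operations opted in.
 * The choice of -EINVAL is an assumption of this sketch. */
int check_per_io_hint(const struct fops_model *fops, bool hint_present)
{
	if (!hint_present)
		return 0;	/* nothing to validate */
	if (!(fops->fop_flags & FOP_PER_IO_HINTS))
		return -EINVAL;	/* this file type did not opt in */
	return 0;
}
```

With this shape, a file system that later finds a way to honor per-io hints would only need to set the flag in its own file operations.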
> 3) Mapping from temperature hints to separate write streams needs to
> happen above the block layer, because file systems need to be in
> control of it to do intelligent placement. That means if you want to
> map from temperature hints to stream separation it needs to be
> implemented at the file operation layer, not in the device driver.
> The mapping implemented in this series is probably only useful for
> block devices. Maybe if dumb file systems want to adopt it, it could
> be split into library code for reuse, but as usual that's probably
> best done only when actually needed.
IMO, the io_uring per-io hint doesn't even need to be limited to the
fcntl lifetime values. It could just be a u16 value, opaque to the
block layer, that gets forwarded to the device unchanged.
> 4) To support this, the block layer, that is bios and requests, needs
> to support a notion of stream separation. Kanchan's previous series
> had most of the bits for that, it just needs to be iterated on.
>
> All of this could probably have been easily done in the time spent on
> this discussion.