[PATCH v7 0/3] FDP and per-io hints

Keith Busch kbusch at kernel.org
Thu Oct 17 09:23:57 PDT 2024


On Thu, Oct 17, 2024 at 09:15:21AM -0700, Bart Van Assche wrote:
> On 10/17/24 8:44 AM, Keith Busch wrote:
> > On Thu, Oct 17, 2024 at 05:23:37PM +0200, Christoph Hellwig wrote:
> > > If you want to do useful stream separation you need to write data
> > > sequentially into the stream.  Now with streams or FDP that does not
> > > actually imply sequentially in LBA space, but if you want the file
> > > system to not actually deal with fragmentation from hell, and be
> > > able to easily track what is grouped together you really want it sequentially
> > > in the LBA space as well.  In other words, any kind of write placement
> > > needs to be intimately tied to the file system block allocator.
> > 
> > I'm replying just to make sure I understand what you're saying:
> > 
> > If we send per IO hints on a file, we could have interleaved hot and
> > cold pages at various offsets of that file, so the filesystem needs an
> > efficient way to allocate extents and track these so that it doesn't
> > interleave these in LBA space. I think that makes sense.
> > 
> > We can add a fop_flags and block/fops.c can be the first one to turn it
> > on since that LBA access is entirely user driven.
> 
> Does anyone care about buffered I/O to block devices? When using
> buffered I/O, the write_hint information from the inode is used and the per
> I/O write_hint information is ignored.

I'm pretty sure there are applications that use buffered IO on raw block
devices (e.g. PostgreSQL), but it's a moot point: the block file_operations
that provide the fop_flags also provide the callbacks for O_DIRECT, which
is where this matters.

We can't really use per-io write_hints on buffered-io. At least not yet,
and maybe never. I'm not sure if it makes sense for raw block because
the page writes won't necessarily match writes to storage.
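
To make the intended behavior concrete, here is a rough user-space model of
the selection logic discussed above. It is only a sketch, not kernel code:
MODEL_FOP_PER_IO_HINTS, struct model_file, the model_hint values and
effective_hint() are all hypothetical names invented for illustration. The
assumption is that a per-io hint is honored only when the file_operations
opt in through fop_flags and the file is opened with O_DIRECT; in every
other case the hint stored on the inode is used.

```c
#include <stdbool.h>

/* Hypothetical opt-in bit; the name and value are made up for this sketch. */
#define MODEL_FOP_PER_IO_HINTS (1u << 0)

/* Hypothetical hint values; 0 stands for "no hint supplied". */
enum model_hint {
	MODEL_HINT_NOT_SET = 0,
	MODEL_HINT_SHORT   = 2,
	MODEL_HINT_LONG    = 4,
};

/* Minimal stand-in for the state that decides which hint applies. */
struct model_file {
	unsigned int fop_flags;     /* capabilities advertised by the file_operations */
	bool direct;                /* true if the file was opened with O_DIRECT */
	enum model_hint inode_hint; /* hint stored on the inode */
};

/*
 * Pick the hint for a single write: the per-io hint wins only for direct IO
 * on files whose file_operations opted in and only if a hint was actually
 * supplied. Buffered IO always falls back to the inode hint.
 */
static enum model_hint effective_hint(const struct model_file *f,
				      enum model_hint io_hint)
{
	if (f->direct &&
	    (f->fop_flags & MODEL_FOP_PER_IO_HINTS) &&
	    io_hint != MODEL_HINT_NOT_SET)
		return io_hint;

	return f->inode_hint;
}
```

With this model, a direct write carrying MODEL_HINT_SHORT on an opted-in
file gets MODEL_HINT_SHORT, while the same write through the buffered path
gets whatever hint the inode carries.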


