[PATCH v7 0/3] FDP and per-io hints

Christoph Hellwig hch at lst.de
Thu Oct 10 02:20:10 PDT 2024


On Thu, Oct 10, 2024 at 09:13:27AM +0200, Javier Gonzalez wrote:
> Is this because RocksDB already does segregation per file itself? Are
> you doing something specific on XFS or using your knowledge on RocksDB to
> map files with an "unwritten" protocol in the middle?

XFS doesn't really do anything smart at all except for grouping files
with similar temperatures, but Hans can probably explain it in more
detail.  So yes, this relies on the application doing the data separation
and using the most logical vehicle for it: files.

>
>    In this context, we have collected data both using FDP natively in
>    RocksDB and using the temperatures. Both look very good, because both
>    are initiated by RocksDB, and the FS just passes the hints directly
>    to the driver.
>
> I ask this to understand if this is the FS responsibility or the
> application's one. Our work points more to letting applications use the
> hints (as the use-cases are power users, like RocksDB). I agree with you
> that a FS could potentially make an improvement for legacy applications
> - we have not focused much on these though, so I trust your insights on
> it.

As mentioned multiple times before in this thread, this absolutely
depends on the abstraction level of the application.  If the application
works on a raw device without a file system it obviously needs very
low-level control.  And in my opinion passthrough is by far the best
interface for that level of control.  If the application is using a
file system there is no better basic level abstraction than a file,
which can then be enhanced with a relatively small amount of additional
information going both ways: the file system telling the application
what good file sizes and write patterns are, and the application telling
the file system what files are good candidates to merge into the same
write stream if the file system has to merge multiple actively written
to files into a write stream.  Trying to do low-level per I/O hints
on top of a file system is a recipe for trouble because you now have
two entities fighting over placement control.
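
To make the file-level side of this concrete, here is a minimal sketch of
an application tagging a whole file with a lifetime hint through the
existing Linux F_SET_RW_HINT fcntl, rather than hinting each I/O.  The
fcntl command numbers and RWH_WRITE_LIFE_* values are the ones from
<linux/fcntl.h>; the file classes and their mapping to hints are purely
hypothetical and not taken from RocksDB or XFS.  What a file system does
with the hint is up to the file system.

```python
import fcntl
import struct
import tempfile

# Command numbers and hint values as defined in <linux/fcntl.h>.
F_LINUX_SPECIFIC_BASE = 1024
F_GET_RW_HINT = F_LINUX_SPECIFIC_BASE + 11
F_SET_RW_HINT = F_LINUX_SPECIFIC_BASE + 12

RWH_WRITE_LIFE_NOT_SET = 0
RWH_WRITE_LIFE_NONE = 1
RWH_WRITE_LIFE_SHORT = 2
RWH_WRITE_LIFE_MEDIUM = 3
RWH_WRITE_LIFE_LONG = 4
RWH_WRITE_LIFE_EXTREME = 5

# Hypothetical application policy (an assumption for illustration only):
# which kind of file is expected to hold data of which lifetime.
HINT_FOR_CLASS = {
    "wal": RWH_WRITE_LIFE_SHORT,
    "sst_l0": RWH_WRITE_LIFE_MEDIUM,
    "sst_deep": RWH_WRITE_LIFE_LONG,
}


def hint_for(file_class):
    """Return the lifetime hint for a file class, or NOT_SET if unknown."""
    return HINT_FOR_CLASS.get(file_class, RWH_WRITE_LIFE_NOT_SET)


def set_file_hint(fd, hint):
    """Attach a write-lifetime hint to the inode behind fd."""
    # The kernel expects a pointer to a 64-bit unsigned integer.
    fcntl.fcntl(fd, F_SET_RW_HINT, struct.pack("=Q", hint))


def get_file_hint(fd):
    """Read back the write-lifetime hint stored on the inode behind fd."""
    buf = fcntl.fcntl(fd, F_GET_RW_HINT, struct.pack("=Q", 0))
    return struct.unpack("=Q", buf)[0]


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        try:
            set_file_hint(f.fileno(), hint_for("wal"))
            print("hint on file:", get_file_hint(f.fileno()))
        except OSError as e:
            # Older kernels or restricted sandboxes may reject the fcntl.
            print("write hints not available here:", e)
```

The point of the sketch is that the hint is set once per file, so the
file is the only unit the application has to reason about and the file
system stays free to decide how files are grouped into write streams.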



