[PATCHv11 0/9] write hints with nvme fdp and scsi streams
Christoph Hellwig
hch at lst.de
Mon Nov 11 02:29:14 PST 2024
On Fri, Nov 08, 2024 at 11:36:20AM -0800, Keith Busch wrote:
> Default partition split so partition one gets all the write hints
> exclusively
I still don't think this actually works as expected, as the user
interface says the write streams are contiguous, and with the bitmap
they aren't.
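To illustrate the contiguity concern, here is a minimal user-space sketch. It is not code from either series. It assumes a hypothetical 64-bit mask in which bit i set means stream i is usable by a partition, and checks whether the usable streams form a single unbroken run. The helper name streams_contiguous is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout: bit i set means stream i is usable by the partition.
 * Returns true if the set bits form one unbroken run (or if none are set).
 */
static bool streams_contiguous(uint64_t mask)
{
	if (mask == 0)
		return true;	/* no streams: trivially contiguous */

	/* Shift out trailing zeros so the lowest set bit lands on bit 0. */
	while ((mask & 1) == 0)
		mask >>= 1;

	/* A run starting at bit 0 has the form 2^n - 1, so mask & (mask + 1) is 0. */
	return (mask & (mask + 1)) == 0;
}
```

A mask such as 0x05 (streams 0 and 2) fails this check, which is the situation an arbitrary bitmap allows and a "streams 1..N" style interface cannot describe.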
As I seem to have a really hard time getting my point across, I instead
spent this morning doing a POC of what I mean, and pushed it here:
http://git.infradead.org/?p=users/hch/misc.git;a=shortlog;h=refs/heads/block-write-streams
The big differences are:
- there is a separate write_stream value now instead of overloading
the write hint. For now it is an 8-bit field for the internal
data structures so that we don't have to grow the bio, but all the
user interfaces are kept at 16 bits (or in case of statx reduced to
it). If this becomes not enough because we need to support devices
with multiple reclaim groups we'll have to find some space by using
unions or growing structures.
- block/fops.c is the place to map the existing write hints into
the write streams instead of the driver
- the stream granularity is added, because adding it to statx at a
later time would be nasty. Getting it in nvme is actually amazingly
cumbersome so I gave up on that and just fed a dummy value for
testing, though
- the partitions remapping is now done using an offset into the global
write stream space so that there is a contiguous number space.
The interface for this is rather hacky, so only treat it as a start
for interface and use case discussions.
- the generic stack limits code stopped stacking the max write
streams. While it does the right thing for simple things like
multipath and mirroring/striping it is wrong for anything non-trivial
like parity raid. I've left this as a separate fold patch for the
discussion.
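
The offset-based partition remapping from the list above could look roughly like the following user-space sketch. This is not the code in the branch. The struct part_streams, its fields, and part_remap_stream are hypothetical names, and the sketch assumes streams are numbered from 1 with 0 meaning "no stream".

```c
#include <stdint.h>

/* Hypothetical per-partition window into the device's global stream space. */
struct part_streams {
	uint16_t offset;	/* global streams that precede this partition's range */
	uint16_t nr;		/* number of streams granted to this partition */
};

/*
 * Map a partition-local stream (1..nr) to a global stream number.
 * Returns 0 (assumed to mean "no stream") for 0 or out-of-range input.
 */
static uint16_t part_remap_stream(const struct part_streams *ps, uint16_t local)
{
	if (local == 0 || local > ps->nr)
		return 0;
	return (uint16_t)(ps->offset + local);
}
```

With offset 4 and nr 3, local streams 1..3 map to global streams 5..7, so each partition sees a contiguous 1..nr range no matter where its window sits in the device's stream space.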