[PATCH v2] nvme: enable FDP support

Keith Busch kbusch at kernel.org
Tue Jun 11 15:42:09 PDT 2024


On Tue, Jun 11, 2024 at 03:43:11PM -0400, Martin K. Petersen wrote:
> 
> Hi Keith!
> 
> > One thing FDP got right was mandating the Endurance Log: the drive
> > must provide a feedback mechanism for the host to know if what they're
> > doing is helpful or harmful.
> 
> Good luck teaching firefox what to do with that information!
> 
> > If you're just blindly throwing random fcntl hints, then you're not
> > the target audience for the feature; you're expected to iterate and
> > tweak your usage.
> 
> And that's exactly my point. What the various attempts at data
> management in the specs have in common is that they are unsuitable for a
> general purpose operating system and its applications.
> 
> We can all come up with a restrictive model which works beautifully for
> one particular application. No problem. But that's not what standards
> are supposed to be about! We used to produce specifications which worked
> for every type of application and device.

FDP isn't a user/distro-level feature. This is expert level, and it is
not reachable by default; you really need admin know-how to create a
namespace that recognizes these semantics. The machines that can reach
*this* feature are most certainly headless servers, so consumer use
cases aren't considered yet! All in good time (maybe).

And as I don asbestos underwear and dare say: Linux already enables
enterprise storage capabilities for filesystems that aren't reachable
for the average user (DAX), so we (Linux) are not exactly fencing off
difficult-to-use features. FDP doesn't even require new kernel
interface changes to wire it up to filesystems, so there's no
additional maintenance burden here. From a pure block and NVMe
maintenance point of view, this is nothing.

I do have gripes with the *kernel* interfaces, though. Mainly that the
hint is per-inode, which makes it useless with raw block IO. We started
FDP with the passthrough interface, and that proved its usage produces
meaningful gains, so hooking this into the existing data separation
provided by fcntl feels like a natural progression despite its
limitations.
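
For reference, the application side of the hinting is tiny. A minimal,
untested sketch (the fallback defines mirror <linux/fcntl.h>, available
since Linux 4.13; the helper name is mine):

#include <fcntl.h>
#include <stdint.h>

/* Some libc headers don't expose these; values from <linux/fcntl.h> */
#ifndef F_SET_RW_HINT
#define F_LINUX_SPECIFIC_BASE   1024
#define F_SET_RW_HINT           (F_LINUX_SPECIFIC_BASE + 12)
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT    2
#endif

/*
 * Tag all future writes through this inode as short-lived data. The
 * hint is a property of the inode, not the fd or the individual IO,
 * which is exactly the raw block IO limitation mentioned above.
 */
static int set_short_lifetime(int fd)
{
        uint64_t hint = RWH_WRITE_LIFE_SHORT;

        return fcntl(fd, F_SET_RW_HINT, &hint);
}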

I think enabling such experimentation can only help make these
interfaces become better and enlighten future protocol changes.
 
> It was a beautiful thing when we went away from cylinders, heads, and
> sectors as tools to do performance management on storage. An abstracted
> model for managing blocks that has worked for everything from USB flash
> drives, over spinning rust, to million dollar storage arrays. With one
> protocol. For decades. And still going. Because the abstraction worked,
> and it removed the burden of having to care about device implementation
> artifacts from applications and operating systems alike.
> 
> We need a similar model for data management. Something which works well
> enough on the device media management side but which transcends one
> particular application or device implementation. I really don't believe
> any of the currently defined data management schemes are timeless the
> same way as LBAs have proven to be...

No disagreement here!

NVMe 1.0 defined the Read/Write CDW13 DSM field with what I think are
almost the desired semantics. 12 years later, no one has implemented
it, but mark my words: we'll circle back to something similar in 12
more years.
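
For the curious: the DSM field is the low byte of CDW13 in NVM reads
and writes; bits 3:0 are access frequency, bits 5:4 access latency,
bit 6 sequential request, and bit 7 incompressible. You can already
experiment with it through passthrough. A rough, untested sketch
(helper name and the choice of 0x4, "frequent writes, infrequent
reads", are mine; fd is the namespace device, e.g. /dev/nvme0n1):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/nvme_ioctl.h>

/*
 * DSM lives in read/write CDW13 bits 7:0: access frequency in 3:0,
 * access latency in 5:4, sequential request in bit 6, incompressible
 * in bit 7.
 */
#define DSM_FREQ_WRITE_HEAVY    0x4     /* frequent writes, infrequent reads */

/* Write one block at LBA 0 with an access frequency hint attached. */
static int write_block_with_hint(int fd, uint32_t nsid, void *buf,
                                 uint32_t block_size)
{
        struct nvme_passthru_cmd cmd = {
                .opcode   = 0x01,       /* NVM write */
                .nsid     = nsid,
                .addr     = (uint64_t)(uintptr_t)buf,
                .data_len = block_size,
                .cdw10    = 0,          /* starting LBA, lower 32 bits */
                .cdw11    = 0,          /* starting LBA, upper 32 bits */
                .cdw12    = 0,          /* NLB is 0's based: one block */
                .cdw13    = DSM_FREQ_WRITE_HEAVY,
        };

        return ioctl(fd, NVME_IOCTL_IO_CMD, &cmd);
}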


