[PATCH v2] nvme: enable FDP support

Keith Busch kbusch at kernel.org
Tue Jun 11 07:32:23 PDT 2024


On Tue, Jun 11, 2024 at 07:47:26AM +0200, Christoph Hellwig wrote:
> On Mon, Jun 10, 2024 at 08:52:12AM -0600, Keith Busch wrote:
> > I agree the FDP setup is complicated, but none of that is taken on by
> > the driver. It just discovers the capabilities and maps an arbitrary
> > software "hint" to an arbitrary device "hint". It's up to the
> > application to use those optimally; the driver just performs the
> > requested mapping.
> > 
> > Is it because the names of those hints indicate data lifetime? These are
> > just arbitrary numbers used by applications to separate placement. If
> > they were called HINT_A, HINT_B, HINT_C, would that make this ok?
> 
> No, the other problem is that FDP very much has an implicit contract
> that the host actually aligns to its resource units, and actually
> has really complicated management.  It's not a simple throw a
> lifetime hint at the drive.  Note that the implicit is indeed very
> implicit - it is a really horrible spec with a lot of assumptions
> but nothing actually enforcing it.  If you just use it for dumb
> lifetime hints chances are that you actually increase write
> amplification.

NVMe has various features that recommend many things, but none of them
are enforced (see NOWS, NPDA, NPWA, etc...). We expect applications to
act in good faith, but nothing is enforced at the protocol level because
it is difficult to bring on useful software otherwise. You just won't
maximize benefits if you don't align, and FDP is no different.
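
To make "align" concrete, here is a rough sketch of the check an
application might do before issuing a write. It assumes the alignment
and granularity come from 0's based fields counted in logical blocks,
which is how NOWS and NPWA are reported; the helper names and the
parameters are made up for illustration and are not taken from the
driver or the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert a 0's based field value (assumed encoding) to a block count. */
static inline uint32_t zeroes_based_to_blocks(uint16_t field)
{
	return (uint32_t)field + 1;
}

/*
 * Hypothetical helper: true if a write starts on an alignment boundary
 * and its length is a whole number of granules. A zero for either
 * value is treated as "nothing reported", so any write passes.
 */
static inline bool write_is_aligned(uint64_t slba, uint32_t nlb,
				    uint32_t align_blocks,
				    uint32_t granule_blocks)
{
	if (!align_blocks || !granule_blocks)
		return true;
	return slba % align_blocks == 0 && nlb % granule_blocks == 0;
}

/* Hypothetical helper: round a length up to a whole number of granules. */
static inline uint64_t round_up_blocks(uint64_t nlb, uint32_t granule_blocks)
{
	assert(granule_blocks != 0);
	return (nlb + granule_blocks - 1) / granule_blocks * granule_blocks;
}
```

Nothing rejects a write that fails this check; it just lands less
optimally on the device.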

The fact that setting up a device to use FDP is such a pain is a clear
indication of the user's intentions and responsibilities for using
it. Yeah, a degenerate application abusing FDP semantics is worse for
performance and device wear than doing nothing at all, but why should
anyone care about that?

One thing FDP got right was mandating the Endurance Log: the drive must
provide a feedback mechanism for the host to know if what they're doing
is helpful or harmful. If you're just blindly throwing random fcntl
hints, then you're not the target audience for the feature; you're
expected to iterate and tweak your usage.
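
As a sketch of what that iteration could look like: sample the
device's counters before and after a change in hint usage and compare
the write amplification over the interval. This assumes the log gives
you a host-written and a media-written byte count; the struct and
function below are hypothetical, and the counters are narrowed to 64
bits to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of two counters read from the device. */
struct wr_counters {
	uint64_t host_bytes;	/* bytes the host asked to write */
	uint64_t media_bytes;	/* bytes the device wrote to media */
};

/*
 * Write amplification over the interval between two snapshots:
 * media bytes written divided by host bytes written.
 * Returns 0.0 if the host wrote nothing in the interval.
 */
static inline double waf_delta(const struct wr_counters *before,
			       const struct wr_counters *after)
{
	uint64_t host = after->host_bytes - before->host_bytes;
	uint64_t media = after->media_bytes - before->media_bytes;

	if (host == 0)
		return 0.0;
	return (double)media / (double)host;
}
```

If the ratio goes up after you start tagging writes, your placement is
hurting rather than helping.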


