What should we do about the nvme atomics mess?
John Garry
john.g.garry at oracle.com
Tue Jul 8 03:08:33 PDT 2025
On 07/07/2025 15:18, Christoph Hellwig wrote:
> Hi all,
>
> I'm a bit lost on what to do about the sad state of NVMe atomic writes.
>
> As a short reminder the main issues are:
>
> 1) there is no flag on a command to request atomic (aka non-torn)
> behavior; instead, writes adhering to the atomicity requirements will
> never be torn, and writes not adhering to them can be torn at any time.
> This differs from SCSI, where atomic writes have to be explicitly
> requested and fail when they can't be satisfied.
> 2) the original way to indicate the main atomicity limit is the AWUPF
> field, which is in Identify Controller, but specified in logical
> blocks which only exist at a namespace layer. This a) led to
> various problems because the limit is a mess when namespaces have
> different logical block sizes, and it b) also causes additional
> issues because NVMe allows it to be different for different
> controllers in the same subsystem.
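To make issue 2a concrete, here is a minimal sketch of the arithmetic. It assumes only that AWUPF is a 0's based count of logical blocks; the helper name awupf_to_bytes() and the example values are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not kernel code).
 *
 * AWUPF is reported once per controller as a 0's based count of
 * logical blocks, so its size in bytes depends on the logical block
 * size of whichever namespace it is applied to.
 */
static uint64_t awupf_to_bytes(uint16_t awupf, uint32_t lba_size)
{
	/* 0's based: a raw value of 0 means one logical block */
	return ((uint64_t)awupf + 1) * lba_size;
}
```

With a raw AWUPF of 7, awupf_to_bytes(7, 512) gives 4096 bytes for a 512-byte-block namespace, while awupf_to_bytes(7, 4096) gives 32768 bytes for a 4096-byte-block namespace on the same controller. A single controller-wide value cannot describe both unless the device happens to scale its guarantee with the block size.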
>
> Commit 8695f060a029 added some sanity checks to deal with issue 2b,
> but we kept running into more issues with it. Partially because
> the check wasn't quite correct, but also because we've gotten
> reports of controllers that change the AWUPF value when reformatting
> namespaces to deal with issue 2a.
>
> And I'm a bit lost on what to do here.
>
> We could:
>
> I. revert the check and the subsequent fixup. If you really want
> to use the nvme atomics you already better pray a lot anyway
> due to issue 1)
> II. limit the check to multi-controller subsystems
> III. don't allow atomics on controllers that only report AWUPF and
> limit support to controllers that support the more sanely
> defined NAWUPF
This would help avoid the ambiguity in whether NABSPF is valid if nsfeat
bit 1 is unset.
However, it would be nice to have an idea of how many products (or what
percentage) it would affect today. FWIW, I only have one SSD which supports
atomics, and it does set that bit.
I suppose we could quirk known "good" HW which relies on AWUPF (to
enable atomics), but that is very far from a nice approach.
>
> I guess for 6.16 we are limited to I. to bring us back to the previous
> state, but I have a really bad gut feeling about it given the really
> bad spec language and a lot of low quality NVMe implementations we're
> seeing these days.
> not the