What should we do about the nvme atomics mess?

John Garry john.g.garry at oracle.com
Tue Jul 8 03:08:33 PDT 2025


On 07/07/2025 15:18, Christoph Hellwig wrote:
> Hi all,
> 
> I'm a bit lost on what to do about the sad state of NVMe atomic writes.
> 
> As a short reminder the main issues are:
> 
>   1) there is no flag on a command to request atomic (aka non-torn)
>      behavior, instead writes adhering to the atomicity requirements will
>      never be torn, and writes not adhering to them can be torn any time.
>      This differs from SCSI where atomic writes have to be explicitly
>      requested and fail when they can't be satisfied
>   2) the original way to indicate the main atomicity limit is the AWUPF
>      field, which is in Identify Controller, but specified in logical
>      blocks which only exist at a namespace layer.  This a) led to
>      various problems because the limit is a mess when namespaces have
>      different logical block sizes, and it b) also causes additional
>      issues because NVMe allows it to be different for different
>      controllers in the same subsystem.
> 
> Commit 8695f060a029 added some sanity checks to deal with issue 2b,
> but we kept running into more issues with it.  Partially because
> the check wasn't quite correct, but also because we've gotten
> reports of controllers that change the AWUPF value when reformatting
> namespaces to deal with issue 2a.
> 
> And I'm a bit lost on what to do here.
> 
> We could:
> 
>   I.	 revert the check and the subsequent fixup.  If you really want
>           to use the nvme atomics you already better pray a lot anyway
> 	 due to issue 1)
>   II.	 limit the check to multi-controller subsystems
>   III.	 don't allow atomics on controllers that only report AWUPF and
>   	 limit support to controllers that support that more sanely
> 	 defined NAWUPF

This would help avoid the ambiguity in whether NABSPF is valid if nsfeat 
bit 1 is unset.
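
To make option III concrete, here is a rough sketch of the policy, not the actual driver code. The struct, the macro name and both function names are made up for illustration. It assumes AWUPF and NAWUPF are 0's based counts of logical blocks, and it treats NAWUPF == 0 as "not reported", which is an assumption of this sketch rather than something settled in this thread.

```c
#include <stdint.h>

/* NSFEAT bit 1: the per-namespace atomic fields are defined.
 * The macro name is hypothetical. */
#define NSFEAT_NSABP (1u << 1)

/* Hypothetical, simplified view of the identify fields involved. */
struct ns_atomic_info {
	uint8_t  nsfeat;   /* Identify Namespace NSFEAT */
	uint16_t nawupf;   /* Identify Namespace NAWUPF, 0's based, in LBAs */
	uint16_t awupf;    /* Identify Controller AWUPF, 0's based, in LBAs */
	uint32_t lba_size; /* logical block size of this namespace, in bytes */
};

/*
 * Option III: only trust the namespace-level field.  Returns the
 * power-fail atomic write unit in bytes, or 0 if atomics would not be
 * advertised for this namespace.
 */
static uint32_t atomic_write_bytes_option3(const struct ns_atomic_info *ns)
{
	if (!(ns->nsfeat & NSFEAT_NSABP) || ns->nawupf == 0)
		return 0;
	return ((uint32_t)ns->nawupf + 1) * ns->lba_size;
}

/*
 * Fallback based on the controller-wide AWUPF, shown only for
 * comparison: the same AWUPF value gives a different byte limit for
 * every logical block size (issue 2a).
 */
static uint32_t atomic_write_bytes_awupf(const struct ns_atomic_info *ns)
{
	return ((uint32_t)ns->awupf + 1) * ns->lba_size;
}
```

With AWUPF = 31, the fallback gives 16 KiB on a 512-byte namespace and 128 KiB on a 4 KiB namespace of the same controller, whereas the option III helper reports nothing unless the namespace itself sets nsfeat bit 1.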

However, it would be nice to have an idea of how many products (or what
percentage) it would affect today. FWIW, I only have one SSD which supports
atomics, and it does set that bit.

I suppose we could quirk known "good" HW which relies on AWUPF (to 
enable atomics), but that is very far from a nice approach.

> 
> I guess for 6.16 we are limited to I. to bring us back to the previous
> state, but I have a really bad gut feeling about it given the really
> bad spec language and a lot of low quality NVMe implementations we're
> seeing these days.
>   not the



