[PATCH] fs: remove power of 2 and length boundary atomic write restrictions
Vitaliy Filippov
vitalifster at gmail.com
Mon Jan 5 10:58:58 PST 2026
>What good is that to a user?
It will allow him to use the feature which he currently can't use.
I don't understand your point about "arbitrary" failures.
Imagine that a user just sends a 256 KB write with RWF_ATOMIC while
the device has NAWUPF = 128 KB.
He gets EINVAL even though the write size is a power of 2 and the
offset is length-aligned. How is that different from the 'arbitrary
failure' you describe?
Now imagine that he sends a write that spans multiple extents in the
FS, and he gets EINVAL once again.
How is that different from what I propose?
Obviously, in all of these cases the app has to make sure that it
satisfies all atomic write requirements before actually issuing such
writes. I think that's absolutely fine.
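
To make the comparison concrete, here is a rough sketch of the kind of
pre-check an app has to do either way. It is not kernel code. The
struct and function names are made up for illustration, and it assumes
the app has already collected the limits (for example via statx()). It
ignores block-size alignment and extent layout, which the app cannot
fully check from user space anyway. The "strict" variant follows the
power-of-2 and length-alignment rules discussed in this thread; the
"relaxed" variant only checks the size limits and that the write does
not cross an atomic boundary, which need not be a power of 2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for limits the app has already queried. */
struct atomic_limits {
    uint64_t unit_min;  /* smallest atomic write, in bytes */
    uint64_t unit_max;  /* largest atomic write, in bytes */
    uint64_t boundary;  /* atomic boundary in bytes, 0 means none */
};

static bool is_pow2(uint64_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Strict rules: size within limits, power-of-2 length,
 * offset aligned to the length. */
static bool atomic_ok_strict(const struct atomic_limits *l,
                             uint64_t off, uint64_t len)
{
    if (len < l->unit_min || len > l->unit_max)
        return false;
    if (!is_pow2(len))
        return false;
    return (off % len) == 0;
}

/* Relaxed rules (illustrative only): size within limits and the
 * write must not cross an atomic boundary of arbitrary size. */
static bool atomic_ok_relaxed(const struct atomic_limits *l,
                              uint64_t off, uint64_t len)
{
    if (len == 0 || len < l->unit_min || len > l->unit_max)
        return false;
    if (l->boundary == 0)
        return true;
    /* First and last byte must fall into the same boundary window. */
    return off / l->boundary == (off + len - 1) / l->boundary;
}
```

With unit_max = 128 KB, a 256 KB write at offset 0 is rejected by both
checks, which is the first case above. With a 12 KB boundary, a 12 KB
write at offset 0 passes the relaxed check but fails the strict one
because 12 KB is not a power of 2.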
On Fri, Jan 2, 2026 at 8:41 PM John Garry <john.g.garry at oracle.com> wrote:
>
> On 30/12/2025 09:01, Vitaliy Filippov wrote:
> > I think that even with the 2^N requirement the user still has to look
> > for boundaries.
> > 1) NVMe disks may have NABO != 0 (atomic boundary offset). In this
> > case 2^N aligned writes won't work at all.
>
> We don't support NABO != 0
>
> > 2) NABSPF is expressed in blocks in the NVMe spec and it's not
> > restricted to 2^N, it can be for example 3 (3*4096 = 12 KB). The spec
> > allows it. 2^N breaks this case too.
>
> We could support NABSPF which is not a power-of-2, but we don't today.
>
> If you can find some real HW which has NABSPF which is not a power-of-2,
> then it can be considered.
>
> > And the user also has to look for the maximum atomic write size
> > anyway, he can't just assume all writes are atomic out of the box,
> > regardless of the 2^N requirement.
> > So my idea is that the kernel's task is just to guarantee correctness
> > of atomic writes. It anyway can't provide the user with atomic writes
> > in all cases.
>
> What good is that to a user?
>
> Consider the user wants to atomic write a range of a file which is
> backed by disk blocks which straddle a boundary - in this case, the
> write would fail. What is the user supposed to do then? That API could
> have arbitrary failures, which effectively makes it a useless API.
>
> As I said before, just don't use RWF_ATOMIC if you don't want to deal
> with these restrictions.