[PATCH v2] Do not require atomic writes to be power of 2 sized and aligned on length boundary
Vitaliy Filippov
vitalifster at gmail.com
Mon Dec 22 01:54:51 PST 2025
Hi! Thanks a lot for your reply! This is actually my first patch ever,
so please forgive me for not following some of the conventions. I'll try
to resubmit it correctly.
Regarding the rest:
1) NVMe atomic boundaries seem to already be checked in
nvme_valid_atomic_write().
2) What's atomic_write_hw_unit_max? As I understand it, Linux already
checks it too; at least
/sys/block/nvme**/queue/atomic_write_max_bytes is already limited by
max_hw_sectors_kb.
3) Yes, I've of course seen that this function is also used by ext4
and xfs, but I don't understand the motivation behind the 2^n
requirement. I suppose file systems may fragment the write according
to currently allocated extents for example, but I don't see how issues
coming from this can be fixed by requiring writes to be 2^n.
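
To make the comparison concrete, here is a minimal user-space sketch of the
two kinds of checks under discussion. It is not kernel code: the function
names pow2_aligned_ok() and crosses_boundary() are hypothetical, and the
first one only models the 2^n length and alignment rule as I understand it,
not the full generic_atomic_write_valid().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the generic rule: the length must be a power of
 * two and the file offset must be a multiple of the length.
 */
static bool pow2_aligned_ok(uint64_t pos, size_t len)
{
    if (len == 0 || (len & (len - 1)) != 0)
        return false;               /* length is not 2^n */
    return (pos & (len - 1)) == 0;  /* offset aligned to length */
}

/*
 * Hypothetical model of a boundary-only rule: the write is rejected only
 * if it straddles a multiple of `boundary` bytes. A boundary of 0 is
 * taken to mean "no boundary configured" (an assumption of this sketch).
 */
static bool crosses_boundary(uint64_t pos, size_t len, uint64_t boundary)
{
    if (boundary == 0 || len == 0)
        return false;
    /* First and last byte must fall into the same boundary window. */
    return pos / boundary != (pos + len - 1) / boundary;
}
```

For example, a 12 KiB write at offset 4 KiB fails pow2_aligned_ok() but does
not cross a 64 KiB boundary. In the other direction, a write that passes
pow2_aligned_ok() can never straddle a power-of-two boundary that is at
least as large as the write, so the 2^n rule implies the boundary rule in
that case, but not the reverse.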
But I understand that just removing the check may break something if
somebody relies on it. What do you think about removing the
requirement only for NVMe, or only for block devices? I see 3 ways
to do it:
a) split generic_atomic_write_valid() into two functions: one for
all types of inodes and one only for file systems.
b) remove generic_atomic_write_valid() from the block device checks entirely.
c) change generic_atomic_write_valid() just like in my original patch
but copy original checks into other places where it's used (ext4 and
xfs).
Which way do you think would be the best?
On Mon, Dec 22, 2025 at 2:17 AM Keith Busch <kbusch at kernel.org> wrote:
>
> On Sun, Dec 21, 2025 at 04:24:02PM +0300, Vitaliy Filippov wrote:
> > It contradicts NVMe specification where alignment is only required when atomic
> > write boundary (NABSPF/NABO) is set and highly limits usage of NVMe atomic writes
>
> Commit header is missing the "fs:" prefix, and the commit log should
> wrap at 72 characters.
>
> On the technical side, this is a generic function used by multiple
> protocols, so you can't just appeal to NVMe to justify removing the
> checks.
>
> NVMe still has atomic boundaries where straddling it fails to be an
> atomic operation. Instead of removing the checks, you'd have to replace
> it with a more costly operation if you really want to support more
> arbitrary write lengths and offsets. And if you do manage to remove the
> power of two requirement, then the queue limit for nvme's
> atomic_write_hw_unit_max isn't correct anymore.