[PATCH v2 00/16] block atomic writes

John Garry john.g.garry at oracle.com
Tue Jan 9 01:55:24 PST 2024


On 21/12/2023 06:50, Christoph Hellwig wrote:
> On Tue, Dec 19, 2023 at 04:53:27PM +0000, John Garry wrote:
>> On 19/12/2023 15:17, Christoph Hellwig wrote:
>>> On Tue, Dec 19, 2023 at 12:41:37PM +0000, John Garry wrote:
>>>> How about something based on fcntl, like below? We will prob also require
>>>> some per-FS flag for enabling atomic writes without HW support. That flag
>>>> might be also useful for XFS for differentiating forcealign for atomic
>>>> writes with just forcealign.
>>> I would have just exposed it through a user visible flag instead of
>>> adding yet another ioctl/fcntl opcode and yet another method.
>>>
>> Any specific type of flag?
>>
>> I would suggest a file attribute which we can set via chattr, but that is
>> still using an ioctl and would require a new inode flag; but at least there
>> is standard userspace support.
> I'd be fine with that, but we're kinda running out of flag there.
> That's why I suggested the FS_XFLAG_ instead, which basically works
> the same.

Hi Christoph,

Coming back to this topic... how about this FS_XFLAG_ and fsxattr update:

diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index da43810b7485..9ef15fced20c 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -118,7 +118,8 @@ struct fsxattr {
        __u32           fsx_nextents;   /* nextents field value (get)   */
        __u32           fsx_projid;     /* project identifier (get/set) */
        __u32           fsx_cowextsize; /* CoW extsize field value (get/set)*/
-       unsigned char   fsx_pad[8];
+       __u32           fsx_atomicwrites_size; /* unit max */
+       unsigned char   fsx_pad[4];
};

/*
@@ -140,6 +141,7 @@ struct fsxattr {
#define FS_XFLAG_FILESTREAM    0x00004000      /* use filestream allocator */
#define FS_XFLAG_DAX           0x00008000      /* use DAX for IO */
#define FS_XFLAG_COWEXTSIZE    0x00010000      /* CoW extent size allocator hint */
+#define FS_XFLAG_ATOMICWRITES  0x00020000
#define FS_XFLAG_HASATTR       0x80000000      /* no DIFLAG for this   */

/* the read-only stuff doesn't really belong here, but any other place is

Having FS_XFLAG_ATOMICWRITES set will lead to FMODE_CAN_ATOMIC_WRITE 
being set.
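
To illustrate the layout change in the diff above, here is a minimal, self-contained sketch. It uses a local mirror of struct fsxattr rather than the real uapi header, since the new field and flag are only proposed here. The names fsxattr_proposed, FS_XFLAG_ATOMICWRITES_PROPOSED and fsx_request_atomic_writes() are made up for illustration. The point is just that the new __u32 is carved out of the existing 8-byte pad, so the struct size should stay the same:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value copied from the proposed diff; hypothetical name, not in any released header. */
#define FS_XFLAG_ATOMICWRITES_PROPOSED 0x00020000u

/* Local mirror of struct fsxattr with the proposal applied (assumption, not the uapi struct). */
struct fsxattr_proposed {
	uint32_t	fsx_xflags;		/* xflags field value (get/set) */
	uint32_t	fsx_extsize;		/* extsize field value (get/set) */
	uint32_t	fsx_nextents;		/* nextents field value (get) */
	uint32_t	fsx_projid;		/* project identifier (get/set) */
	uint32_t	fsx_cowextsize;		/* CoW extsize field value (get/set) */
	uint32_t	fsx_atomicwrites_size;	/* proposed: unit max */
	unsigned char	fsx_pad[4];		/* pad shrinks from 8 to 4 bytes */
};

/* The new field reuses 4 of the 8 pad bytes, so the overall size is unchanged. */
_Static_assert(sizeof(struct fsxattr_proposed) == 28,
	       "struct size must not change");
_Static_assert(offsetof(struct fsxattr_proposed, fsx_atomicwrites_size) == 20,
	       "new field starts where the old pad started");

/* Hypothetical helper: mark a file's attributes as requesting atomic writes of 'size' bytes. */
static void fsx_request_atomic_writes(struct fsxattr_proposed *fsx, uint32_t size)
{
	fsx->fsx_xflags |= FS_XFLAG_ATOMICWRITES_PROPOSED;
	fsx->fsx_atomicwrites_size = size;
}
```

In a real program the filled-in structure would be passed to the existing FS_IOC_FSSETXATTR ioctl, assuming the kernel and headers carried this proposal.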

So a user can issue:

 >xfs_io -c "atomic-writes 64K" mnt/file
 >xfs_io -c "atomic-writes" mnt/file
[65536] mnt/file

and then:

 >xfs_io -c "lsattr" mnt/file
------------------W mnt/file

(W is the new flag for atomic writes, obviously)

The user will still have to issue statx to get the actual atomic write
limit for a file, as 'xfs_io -c "atomic-writes"' does not take into
account any HW/Linux block layer atomic write limits.
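
As a rough sketch of why the two values can differ: the size set through the fsxattr is only what the user asked for, while the limit seen through statx would also be bounded by the device. The helper below is hypothetical and assumes the effective maximum is simply the smaller of the two values in the HW-backed case. It does not model the possible CoW-based fallback for devices without atomic write support.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel or xfsprogs function.
 * fs_size:     size requested via the proposed fsx_atomicwrites_size field.
 * hw_unit_max: atomic write unit max reported for the device (assumed input).
 * Returns the smaller of the two, i.e. the assumed effective limit.
 */
static uint32_t effective_atomic_write_max(uint32_t fs_size, uint32_t hw_unit_max)
{
	return fs_size < hw_unit_max ? fs_size : hw_unit_max;
}
```

So with 64K set on the file and a device that only supports 16K atomic writes, this model would give 16K, which is the sort of value the user would need statx to discover.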

FS_XFLAG_ATOMICWRITES will force XFS extent size and alignment to
fsx_atomicwrites_size when we have HW support, so it is effectively the
same as forcealign. Without HW support, we still specify a size. With a
possible XFS CoW solution for the no-HW-support case, I suppose there
would be no size limit in reality, so specifying the size would only be
for consistency of the userspace experience.

Is this the sort of userspace API which you would like to see?

John


