[PATCH v2 00/16] block atomic writes
John Garry
john.g.garry at oracle.com
Thu Jan 11 01:55:36 PST 2024
On 11/01/2024 05:02, Christoph Hellwig wrote:
> On Wed, Jan 10, 2024 at 05:40:56PM -0800, Darrick J. Wong wrote:
>> struct statx statx;
>> struct fsxattr fsxattr;
>> int fd = open('/foofile', O_RDWR | O_DIRECT);
I'm assuming O_CREAT also.
>>
>> ioctl(fd, FS_IOC_GETXATTR, &fsxattr);
>>
>> fsxattr.fsx_xflags |= FS_XFLAG_FORCEALIGN | FS_XFLAG_WRITE_ATOMIC;
>> fsxattr.fsx_extsize = 16384; /* only for hardware no-tears writes */
>>
>> ioctl(fd, FS_IOC_SETXATTR, &fsxattr);
>>
>> statx(fd, "", AT_EMPTY_PATH, STATX_ALL | STATX_WRITE_ATOMIC, &statx);
>>
>> if (statx.stx_atomic_write_unit_max >= 16384) {
>> pwrite(fd, &iov, 1, 0, RWF_SYNC | RWF_ATOMIC);
>> printf("HAPPY DANCE\n");
>> }
>
> I think this still needs a check if the fs needs alignment for
> atomic writes at all. i.e.
>
> struct statx statx;
> struct fsxattr fsxattr;
> int fd = open('/foofile', O_RDWR | O_DIRECT);
>
> ioctl(fd, FS_IOC_GETXATTR, &fsxattr);
> statx(fd, "", AT_EMPTY_PATH, STATX_ALL | STATX_WRITE_ATOMIC, &statx);
> if (statx.stx_atomic_write_unit_max < 16384) {
> bailout();
> }
How could this value be >= 16384 initially? Would it come from
pre-configured FS alignment, like XFS RT extsize? Or from some
special CoW-based atomic write support? Or from an FS block size of 16384?

Incidentally, for consistency, only setting FS_XFLAG_WRITE_ATOMIC will
lead to FMODE_CAN_ATOMIC_WRITE being set. So until FS_XFLAG_WRITE_ATOMIC
is set, would it make sense to have statx return 0 for
STATX_WRITE_ATOMIC? Otherwise the user may be misled into thinking that
it is ok to issue an atomic write (when it isn’t).
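
To make the suggestion concrete, here is a rough userspace-only model of what I mean. It does not call statx or touch any kernel interface; the struct and both helper names are made up purely for illustration, and the only field modelled is the unit_max value discussed above. The idea is simply that the reported limit stays at 0 until the flag is set, so a caller that checks the limit can never conclude that an atomic write is ok too early.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the atomic write limit that statx would report. */
struct atomic_write_limits {
	uint32_t unit_max;	/* largest atomic write in bytes, 0 = not available */
};

/*
 * Model of the suggested semantics: report a zeroed limit until the
 * atomic-write flag has been set on the file, whatever the underlying
 * capability (passed in as "capable") happens to be.
 */
static struct atomic_write_limits
reported_limits(bool write_atomic_flag_set, struct atomic_write_limits capable)
{
	struct atomic_write_limits none = { 0 };

	return write_atomic_flag_set ? capable : none;
}

/* Caller-side check: only attempt an atomic write that the limit covers. */
static bool may_issue_atomic_write(struct atomic_write_limits l, uint32_t len)
{
	if (l.unit_max == 0)
		return false;
	return len <= l.unit_max;
}
```

With this model a 16384-byte write is refused while the flag is clear and allowed once it is set, which is the behaviour I would expect an application to rely on.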
Thanks,
John
>
> fsxattr.fsx_xflags |= FS_XFLAG_WRITE_ATOMIC;
> if (statx.stx_atomic_write_alignment) {
> fsxattr.fsx_xflags |= FS_XFLAG_FORCEALIGN;
> fsxattr.fsx_extsize = 16384; /* only for hardware no-tears writes */
> }
> if (ioctl(fd, FS_IOC_SETXATTR, &fsxattr) < 1) {
> bailout();
> }
>
> pwrite(fd, &iov, 1, 0, RWF_SYNC | RWF_ATOMIC);
> printf("HAPPY DANCE\n");
>