[PATCH 17/21] fs: xfs: iomap atomic write support

John Garry john.g.garry at oracle.com
Tue Nov 28 09:42:10 PST 2023


On 28/11/2023 13:56, Christoph Hellwig wrote:
> On Tue, Nov 28, 2023 at 08:56:37AM +0000, John Garry wrote:
>> Are you suggesting some sort of hybrid between the atomic write series you
>> had a few years ago and this solution?
> Very roughly, yes.
> 
>> To me that would be continuing with the following:
>> - per-IO RWF_ATOMIC (and not O_ATOMIC semantics of nothing is written until
>> some data sync)
> Yes.
> 
>> - writes must be a power-of-two and at a naturally-aligned offset
> Where offset is offset in the file? 

ok, fine, it would not be required for XFS with CoW. Some concerns still:
a. device atomic write boundary, if any
b. other FSes which do not have CoW support. ext4 is already being used 
for "atomic writes" in the field - see dubious amazon torn-write prevention.

About b., we could add the pow-of-2 and file offset alignment 
requirement for other FSes, but then need to add some method to 
advertise that restriction.

> It would not require it.  You
> probably want to do it for optimal performance, but requiring it
> feeels rather limited.
> 
>> - relying on atomic write HW support always
> And I think that's where we have different opinions.

I'm just trying to understand your idea and that is not necessarily my 
final opinion.

>  I think the hw
> offload is a nice optimization and we should use it wherever we can.

Sure, but to me it is a concern that we have 2x paths to make robust a. 
offload via hw, which may involve CoW b. no HW support, i.e. CoW always

And for no HW support, if we don't follow the O_ATOMIC model of 
committing nothing until a SYNC is issued, would we allocate, write, and 
later free a new extent for each write, right?

> But building the entire userspace API around it feels like a mistake.
> 

ok, but FWIW it works for the usecases which we know.

>> BTW, we also have rtvol support which does not use forcealign as it already
>> can guarantee alignment, but still does rely on the same principle of
>> requiring alignment - would you want CoW support there also?
> Upstream doesn't have out of place write support for the RT subvolume
> yet.  But Darrick has a series for it and we're actively working on
> upstreaming it.
Yeah, I thought that I heard this.

Thanks,
John



More information about the Linux-nvme mailing list