[PATCH v7 2/9] fs: Initial atomic write support

John Garry john.g.garry at oracle.com
Wed Jun 5 23:38:41 PDT 2024


On 06/06/2024 06:41, Christoph Hellwig wrote:
> On Wed, Jun 05, 2024 at 11:48:12AM +0100, John Garry wrote:
>> I have no strong attachment to that name (atomic).
>>
>> For both SCSI and NVMe, it's an "atomic" feature and I was basing the
>> naming on that.
>>
>> We could have RWF_NOTEARS or RWF_UNTEARABLE_WRITE or RWF_UNTEARABLE or
>> RWF_UNTORN or similar. Any preference?
> 
> No particular preference between any of the options, including atomic.
> Just mumbling my thoughts out loud :)

Regardless of the userspace API, I think that the block layer 
terminology should match that of the underlying HW technology - so I 
would plan to keep "atomic" in the block layer, including request_queue 
sysfs limits.

If we used RWF_UNTORN, at some level the "atomic" and "untorn" 
terminology would need to interface with one another. If it's going to 
be insane to have RWF_UNTORN from userspace being translated into 
REQ_ATOMIC, then I could keep RWF_ATOMIC.
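
To make the interfacing concern concrete, here is a minimal sketch of the two-step translation, assuming a userspace "untorn" flag that ends up as an "atomic" flag in the block layer. All SKETCH_* names, the bit values, and the helper functions are hypothetical and for illustration only; they are not the kernel's definitions, and the real code path may look different.

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SKETCH_RWF_UNTORN   (1u << 0)  /* userspace per-I/O flag ("untorn" naming) */
#define SKETCH_IOCB_ATOMIC  (1u << 1)  /* per-I/O control block flag */
#define SKETCH_REQ_ATOMIC   (1u << 2)  /* block layer request flag ("atomic" naming) */

/* Step 1: userspace flag -> iocb flag. This is where the two names meet. */
static unsigned int sketch_rwf_to_iocb(unsigned int rwf_flags)
{
	unsigned int iocb_flags = 0;

	if (rwf_flags & SKETCH_RWF_UNTORN)
		iocb_flags |= SKETCH_IOCB_ATOMIC;
	return iocb_flags;
}

/* Step 2: iocb flag -> request flag. Same terminology on both sides. */
static unsigned int sketch_iocb_to_req(unsigned int iocb_flags)
{
	unsigned int req_flags = 0;

	if (iocb_flags & SKETCH_IOCB_ATOMIC)
		req_flags |= SKETCH_REQ_ATOMIC;
	return req_flags;
}

/* Convenience wrapper covering both steps. */
static unsigned int sketch_rwf_to_req(unsigned int rwf_flags)
{
	return sketch_iocb_to_req(sketch_rwf_to_iocb(rwf_flags));
}

/* Usage example: the flag survives both steps, and no flag stays no flag. */
static void sketch_self_check(void)
{
	assert(sketch_rwf_to_req(SKETCH_RWF_UNTORN) == SKETCH_REQ_ATOMIC);
	assert(sketch_rwf_to_req(0) == 0);
}
```

With a naming split, the mismatch would be confined to step 1; everything below that could keep "atomic".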

Someone please decide ....

> 
>> For io_uring/rw.c, we have io_write() -> io_rw_init_file(..., WRITE), and
>> then later we set IOCB_WRITE, so it would be neat to use it there. But then
>> do_iter_readv_writev() does not set IOCB_WRITE - I can't imagine that
>> setting IOCB_WRITE would do any harm there. I see a similar change in
>> https://lore.kernel.org/linux-fsdevel/167391048988.2311931.1567396746365286847.stgit@warthog.procyon.org.uk/
>>
>> AFAICS, setting IOCB_WRITE is quite inconsistent. From browsing through
>> fsdevel on lore, there was some history in trying to use IOCB_WRITE always
>> instead of iov_iter direction. Any idea what happened to that?
>>
>> I'm just getting the feeling that setting IOCB_WRITE in
>> kiocb_set_rw_flags() is a small part - and maybe counterproductive - of a
>> larger job of fixing IOCB_WRITE usage.
> 
> Someone (IIRC Dave H.) wanted to move it into the iov_iter a while ago.
> I think that is a bad idea - the iov_iter is a data container that, except
> for the shoehorned-in read/write information, doesn't describe the
> operation at all.  So using the flag in the iocb seems like the better
> architecture.  But I can understand that you might want to stay out
> of all of this, so let's not touch IOCB_WRITE here.
> 

ok