[PATCH] vfs: remove the excl argument from the ->create() inode_operation
NeilBrown
neilb at ownmail.net
Thu Nov 6 16:00:34 PST 2025
On Fri, 07 Nov 2025, Jeff Layton wrote:
> On Thu, 2025-11-06 at 07:07 -0500, Jeff Layton wrote:
> > On Thu, 2025-11-06 at 08:23 +1100, NeilBrown wrote:
> > > On Thu, 06 Nov 2025, Jeff Layton wrote:
> > > > Since ce8644fcadc5 ("lookup_open(): expand the call of vfs_create()"),
> > > > the "excl" argument to the ->create() inode_operation is always set to
> > > > true. Remove it, and fix up all of the create implementations.
> > >
> > > nonono
> > >
> > >
> > > > @@ -3802,7 +3802,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
> > > > }
> > > >
> > > > error = dir_inode->i_op->create(idmap, dir_inode, dentry,
> > > > - mode, open_flag & O_EXCL);
> > > > + mode);
> > >
> > > "open_flag & O_EXCL" is not the same as "true".
> > >
> > > It is true that "all calls to vfs_create() pass true for 'excl'"
> > > The same is NOT true for inode_operations.create.
> > >
> >
> > I don't think this is a problem, actually:
> >
> > Almost all of the existing ->create() operations ignore the "excl"
> > bool. There are only two that I found that do not: NFS and GFS2. Both
> > of those have an ->atomic_open() operation though, so lookup_open()
> > will never call ->create() for those filesystems. This means that
> > ->create() _is_ always called with excl == true.
>
> How about this for a revised changelog, which makes the above clear:
>
> vfs: remove the excl argument from the ->create() inode_operation
>
> Since ce8644fcadc5 ("lookup_open(): expand the call of vfs_create()"),
> the "excl" argument to the ->create() inode_operation is always set to
> true in vfs_create().
>
> There is another call to ->create() in lookup_open() that can set it to
> either true or false. All of the ->create() operations in the kernel
> ignore the excl argument, except for NFS and GFS2. Both NFS and GFS2
> have an ->atomic_open() operation, however, so lookup_open() will never
> call ->create() on those filesystems.
>
> Remove the "excl" argument from the ->create() operation, and fix up the
> filesystems accordingly.
Thanks, that is a substantial improvement. I see your point now and I
think this is a really nice cleanup to make - thanks.

I think the commit message could be improved further by leading with the
detail that is central - that most ->create functions ignore 'excl'.

With two exceptions, ->create() methods provided by filesystems ignore
the "excl" flag. Those exceptions are NFS and GFS2, which both also
provide ->atomic_open.

excl is always true when ->create is called from vfs_create() (since
commit......) so the only time it can be false is when it is called by
lookup_open() for filesystems that do not provide ->atomic_open.

So the excl flag to ->create is either ignored or true. So we can
remove it and change NFS and GFS2 to act as though it were true.
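
To make the case analysis above concrete, here is a rough userspace
model of it. This is only a sketch of the reasoning, not kernel code:
struct fs_model, enum create_caller and the three helper functions are
names invented for illustration, and the model assumes (as stated
earlier in the thread) that lookup_open() never reaches ->create() when
the filesystem provides ->atomic_open.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only. The struct, enum and function names here are
 * hypothetical and are not kernel API.
 */
struct fs_model {
	bool has_atomic_open;   /* filesystem provides ->atomic_open */
	bool create_reads_excl; /* its ->create looks at the excl argument */
};

enum create_caller { FROM_VFS_CREATE, FROM_LOOKUP_OPEN };

/* lookup_open() uses ->atomic_open when present, so ->create is skipped. */
static bool create_reachable(const struct fs_model *fs, enum create_caller c)
{
	return !(c == FROM_LOOKUP_OPEN && fs->has_atomic_open);
}

/* vfs_create() always passes true; lookup_open() passes open_flag & O_EXCL. */
static bool excl_passed(enum create_caller c, bool o_excl)
{
	return c == FROM_VFS_CREATE ? true : o_excl;
}

/*
 * Returns true if some reachable call hands excl == false to a ->create
 * that actually reads it, i.e. if dropping the argument (and assuming
 * true) could change behaviour for this filesystem.
 */
static bool removal_changes_behaviour(const struct fs_model *fs)
{
	const enum create_caller callers[] = { FROM_VFS_CREATE, FROM_LOOKUP_OPEN };
	const bool o_excl_values[] = { false, true };

	for (size_t i = 0; i < 2; i++)
		for (size_t j = 0; j < 2; j++)
			if (create_reachable(fs, callers[i]) &&
			    fs->create_reads_excl &&
			    !excl_passed(callers[i], o_excl_values[j]))
				return true;
	return false;
}
```

In this model a filesystem that ignores excl, and one that reads excl
but also has ->atomic_open (the NFS and GFS2 pattern), both come out as
unaffected. Only a filesystem that reads excl without providing
->atomic_open would be affected, and per the discussion above no such
filesystem was found in the tree.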
>
> Maybe we also need some comments or updates to Documentation/ to make
> it clear that ->create() always implies O_EXCL semantics?
Definitely, something in porting.rst and something in vfs.rst.
It would be worth saying somewhere that if the fs needs to mediate
non-exclusive creation, it must provide atomic_open().
Thanks,
NeilBrown
> --
> Jeff Layton <jlayton at kernel.org>
>