[RFC PATCH v2] hostfs: handle idmapped mounts

Glenn Washburn development at efficientek.com
Wed Mar 15 23:20:19 PDT 2023


On 3/4/23 12:01, Christian Brauner wrote:
> On Sat, Mar 04, 2023 at 12:28:46AM -0600, Glenn Washburn wrote:
>> On Thu, 2 Mar 2023 09:39:28 +0100
>> Christian Brauner <brauner at kernel.org> wrote:
>>
>>> On Tue, Feb 28, 2023 at 07:50:02PM -0600, Glenn Washburn wrote:
>>>> Let hostfs handle idmapped mounts. This allows to have the same
>>>> hostfs mount appear in multiple locations with different id
>>>> mappings.
>>>>
>>>> root@(none):/media# id
>>>> uid=0(root) gid=0(root) groups=0(root)
>>>> root@(none):/media# mkdir mnt idmapped
>>>> root@(none):/media# mount -thostfs -o/home/user hostfs mnt
>>>>
>>>> root@(none):/media# touch mnt/aaa
>>>> root@(none):/media# mount-idmapped --map-mount u:`id -u user`:0:1
>>>> --map-mount g:`id -g user`:0:1 /media/mnt /media/idmapped
>>>> root@(none):/media# ls -l mnt/aaa idmapped/aaa -rw-r--r-- 1 root
>>>> root 0 Jan 28 01:23 idmapped/aaa -rw-r--r-- 1 user user 0 Jan 28
>>>> 01:23 mnt/aaa
>>>>
>>>> root@(none):/media# touch idmapped/bbb
>>>> root@(none):/media# ls -l mnt/bbb idmapped/bbb
>>>> -rw-r--r-- 1 root root 0 Jan 28 01:26 idmapped/bbb
>>>> -rw-r--r-- 1 user user 0 Jan 28 01:26 mnt/bbb
>>>>
>>>> Signed-off-by: Glenn Washburn <development at efficientek.com>
>>>> ---
>>>> Changes from v1:
>>>>   * Rebase on to tip. The above commands work and have the results
>>>> expected. The __vfsuid_val(make_vfsuid(...)) seems ugly to get the
>>>> uid_t, but it seemed like the best one I've come across. Is there a
>>>> better way?
>>>
>>> Sure, I can help you with that. ;)
>>
>> Thank you!
>>
>>>>
>>>> Glenn
>>>> ---
>>>>   fs/hostfs/hostfs_kern.c | 13 +++++++------
>>>>   1 file changed, 7 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
>>>> index c18bb50c31b6..9459da99a0db 100644
>>>> --- a/fs/hostfs/hostfs_kern.c
>>>> +++ b/fs/hostfs/hostfs_kern.c
>>>> @@ -786,7 +786,7 @@ static int hostfs_permission(struct mnt_idmap
>>>> *idmap, err = access_file(name, r, w, x);
>>>>   	__putname(name);
>>>>   	if (!err)
>>>> -		err = generic_permission(&nop_mnt_idmap, ino,
>>>> desired);
>>>> +		err = generic_permission(idmap, ino, desired);
>>>>   	return err;
>>>>   }
>>>>   
>>>> @@ -794,13 +794,14 @@ static int hostfs_setattr(struct mnt_idmap
>>>> *idmap, struct dentry *dentry, struct iattr *attr)
>>>>   {
>>>>   	struct inode *inode = d_inode(dentry);
>>>> +	struct user_namespace *fs_userns = i_user_ns(inode);
>>>
>>> Fyi, since hostfs can't be mounted in a user namespace
>>> fs_userns == &init_user_ns
>>> so it doesn't really matter what you use.
>>
>> What would you suggest as preferable?
> 
> I would leave init_user_ns hardcoded as it clearly indicates that hostfs
> can only be mounted in the initial user namespace. Plus, the patch is
> smaller.
> 
>>
>>>>   	struct hostfs_iattr attrs;
>>>>   	char *name;
>>>>   	int err;
>>>>   
>>>>   	int fd = HOSTFS_I(inode)->fd;
>>>>   
>>>> -	err = setattr_prepare(&nop_mnt_idmap, dentry, attr);
>>>> +	err = setattr_prepare(idmap, dentry, attr);
>>>>   	if (err)
>>>>   		return err;
>>>>   
>>>> @@ -814,11 +815,11 @@ static int hostfs_setattr(struct mnt_idmap
>>>> *idmap, }
>>>>   	if (attr->ia_valid & ATTR_UID) {
>>>>   		attrs.ia_valid |= HOSTFS_ATTR_UID;
>>>> -		attrs.ia_uid = from_kuid(&init_user_ns,
>>>> attr->ia_uid);
>>>> +		attrs.ia_uid = __vfsuid_val(make_vfsuid(idmap,
>>>> fs_userns, attr->ia_uid)); }
>>>>   	if (attr->ia_valid & ATTR_GID) {
>>>>   		attrs.ia_valid |= HOSTFS_ATTR_GID;
>>>> -		attrs.ia_gid = from_kgid(&init_user_ns,
>>>> attr->ia_gid);
>>>> +		attrs.ia_gid = __vfsgid_val(make_vfsgid(idmap,
>>>> fs_userns, attr->ia_gid));
>>>
>>> Heh, if you look include/linux/fs.h:
>>>
>>>          /*
>>>           * The two anonymous unions wrap structures with the same
>>> member. *
>>>           * Filesystems raising FS_ALLOW_IDMAP need to use
>>> ia_vfs{g,u}id which
>>>           * are a dedicated type requiring the filesystem to use the
>>> dedicated
>>>           * helpers. Other filesystem can continue to use ia_{g,u}id
>>> until they
>>>           * have been ported.
>>>           *
>>>           * They always contain the same value. In other words
>>> FS_ALLOW_IDMAP
>>>           * pass down the same value on idmapped mounts as they would
>>> on regular
>>>           * mounts.
>>>           */
>>>          union {
>>>                  kuid_t          ia_uid;
>>>                  vfsuid_t        ia_vfsuid;
>>>          };
>>>          union {
>>>                  kgid_t          ia_gid;
>>>                  vfsgid_t        ia_vfsgid;
>>>          };
>>>
>>> this just is:
>>>
>>> attrs.ia_uid = from_vfsuid(idmap, fs_userns, attr->ia_vfsuid));
>>> attrs.ia_gid = from_vfsgid(idmap, fs_userns, attr->ia_vfsgid));
>>
>> Its easy to miss from this patch because of lack of context, but attrs
>> is a struct hostfs_iattr, not struct iattr. And attrs.ia_uid is of type
>> uid_t, not kuid_t. So the above fails to compile. This is why I needed
> 
> Oh, I see. And then that raw value is used by calling
> fchmod()/chmod()/chown()/fchown() and so on. That's rather special.
> Ok, then I know what to do.
> 
>> to wrap make_vfsuid() in __vfsuid_val() (to get the uid_t).
> 
> Right. My point had been - independent of the struct hostfs_iattr issue
> you thankfully pointed out - that make_vfsuid() is wrong here.
> 
> make_vfsuid() is used to map a filesystem wide k{g,u}id_t according to
> the mount's idmapping that operation originated from. But that's done
> by the vfs way before we're calling into the filesystem. For example,
> it's done in chown_common().
> 
> So the value placed in struct iattr (the VFS struct) is already a
> vfs{g,u}id stored in iattr->ia_vfs{g,u}id. So you need to use
> from_vfs{g,u}id() here.
> 
>>
>> I had decided against using from_vfsuid() because then I thought I'd
>> need to use from_kuid() to get the uid_t. And from_kuid() takes the
>> namespace (again), which seemed uglier.
>>
>> Based on this, what do you suggest?
> 
> Ok, so just some details on the background before I paste what I think
> we should do.
> As soon as you support idmapped mounts you at least technically are
Thanks for the detailed explanation. I apologize for not getting back to 
this sooner.

> always dealing with two mappings:
> 
> (1) First, there's the filesystem wide idmapping which is taken from the
>      namespace the filessytem was mounted in. This idmapping is applied
>      when you read the raw uid/gid value from disk and turn into a kuid_t
>      type. That value is persistent and stored in inode->i_{g,u}id. All
>      things that are cached and that can be accessed from multiple mounts
>      concurrently can only ever cache k{g,u}id_t aka filesystem values.
> (2) Whenever we're dealing with an operation that's coming from an
>      idmapped mount we need to take the idmapping of the mount into
>      account. That idmapping is completely separate type struct
>      mnt_idmap that's opaque for filesystems and most of the vfs.
> 
>      That idmapping is used to generate the vfs{g,u}id_t. IOW, translates
>      from the filesystem representation to a mount/vfs representation.
> 
> So, in order to store the correct value on disk we need to invert those
> two idmappings to arrive at the raw value that we want to store:
> (U1) from_vfsuid() // map to the filesystem wide value aka something
>       that we can store in inode->i_{g,u}id and that's cacheable. This is
>       done in setattr_copy().
> (U2) from_kuid() // map the filesystem wide value to the raw value we
>       want to store on disk

It seems to me that there are actually 3 mappings, with the third being 
(U2) above (ie vfsuid_t -> kuid_t). And that from_vfsuid() does mappings 
(1) and (2) above. Is this incorrect?

Whats confusing to me is that from_vfsuid() takes both an idmap and a 
user namespace, so presumably it will handle both mapping types (1) and 
(2). And then there's from_kuid() which takes an idmap, so I thought it 
might also do a type (2) mapping. But looking at the code it doesn't 
seem to ever use its idmap parameter. Can you explain the rational 
behind having from_kuid() take an idmap? Is it legacy that will be 
cleaned up as this code settles down / stabilizes? Or perhaps its

> 
> For nearly all filesystems these steps almost never need to be performed
> explicitly. Instead, dedicated vfs helpers will do this:
> 
> (U1) i_{g,u}id_update() // map to filesystem wide value
> (U2) i_{g,u}id_read() // map to raw on-disk value
> 
> For filesystems that don't support being mounted in namespaces the (U2)
> step is always a nop. So technically there's no difference between:
> 
> (U2) from_kuid() and __kuid_val(kuid)
> 
> but it's cleaner to use the helpers even in that case.
> 
> So given how hostfs works these two steps need to be performed
> explicitly. So I suggest (untested):
> 
> diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
> index c18bb50c31b6..72b7e1bcc32e 100644
> --- a/fs/hostfs/hostfs_kern.c
> +++ b/fs/hostfs/hostfs_kern.c
> @@ -813,12 +813,22 @@ static int hostfs_setattr(struct mnt_idmap *idmap,
>                  attrs.ia_mode = attr->ia_mode;
>          }
>          if (attr->ia_valid & ATTR_UID) {
> +               kuid_t kuid;
> +
>                  attrs.ia_valid |= HOSTFS_ATTR_UID;
> -               attrs.ia_uid = from_kuid(&init_user_ns, attr->ia_uid);
> +               /* Map the vfs id into the filesystem. */
> +               kuid = from_vfsuid(idmap, &init_user_ns, attr->ia_vfsuid);
> +               /* Map the filesystem id to its raw on disk value. */
> +               attrs.ia_uid = from_kuid(&init_user_ns, kuid);

Its interesting that this is what I originally discarded, as an 
unfamiliar reader, it looks like you're doing two namespace mappings. 
But that's not happening because from_kuid() disregards its namespace 
parameter.

I've tested this and it does seems to work. Thanks!

Glenn

>          }
>          if (attr->ia_valid & ATTR_GID) {
> +               kgid_t kgid;
> +
>                  attrs.ia_valid |= HOSTFS_ATTR_GID;
> -               attrs.ia_gid = from_kgid(&init_user_ns, attr->ia_gid);
> +               /* Map the vfs id into the filesystem. */
> +               kgid = from_vfsgid(idmap, &init_user_ns, attr->ia_vfsgid);
> +               /* Map the filesystem id to its raw on disk value. */
> +               attrs.ia_gid = from_kgid(&init_user_ns, kgid);
>          }
>          if (attr->ia_valid & ATTR_SIZE) {
>                  attrs.ia_valid |= HOSTFS_ATTR_SIZE;




More information about the linux-um mailing list