[PATCH v2 2/2] hostfs: store permissions in extended attributes

Marko Petrović petrovicmarko2006 at gmail.com
Fri Apr 14 10:19:52 PDT 2023


On Fri Apr 14, 2023 at 9:40 AM CEST, Johannes Berg wrote:
> On Fri, 2023-04-14 at 04:33 +0200, Marko Petrović wrote:
> > Fix GID assignment error in uml_chown() and add support for correct
> > behavior when parent directory has SETGID bit.
>
>
> That was the change for 'v2' I guess, but you should document here what
> the new code does and why, probably a good chunk of your cover letter
> could be here :-)
Hello! First, thank you for your feedback!
Ok, when sending the PATCH v3, I will document entire patch here.
>
> > +++ b/fs/hostfs/hostfs.h
> > @@ -37,6 +37,7 @@
> >   * is on, and remove the appropriate bits from attr->ia_mode (attr is a
> >   * "struct iattr *"). -BlaisorBlade
> >   */
> > +extern int use_xattr;
>
>
> That seems like an odd place to put it - the comment above is surely for
> the hostfs_timespec?
Actually it appears to be talking about the missing defines of
ATTR_KILL_SUID and ATTR_KILL_SGID, unrelated to the struct hostfs_timespec.
As for why I put it there, I thought it would be nice not to mix
function declarations with variable declarations.
>
> > --- a/fs/hostfs/hostfs_kern.c
> > +++ b/fs/hostfs/hostfs_kern.c
> > @@ -17,6 +17,7 @@
> >  #include <linux/writeback.h>
> >  #include <linux/mount.h>
> >  #include <linux/namei.h>
> > +#include <linux/uidgid.h>
> >  #include "hostfs.h"
> >  #include <init.h>
> >  #include <kern.h>
> > @@ -40,6 +41,7 @@ static struct kmem_cache *hostfs_inode_cache;
> >  /* Changed in hostfs_args before the kernel starts running */
> >  static char *root_ino = "";
> >  static int append = 0;
> > +int use_xattr;
>
> Not relevant just _yet_, but FYI, I have a patch somewhere to build the
> _user.c part always into the kernel image, and then this needs to be
> over there and exported for the module, I think.
>
> (But I need to see what's the state of my patch)
Ok. In that case, I can add EXPORT_SYMBOL() for these variables.
>
> > @@ -578,7 +587,9 @@ static int hostfs_create(struct mnt_idmap *idmap, struct inode *dir,
> >  	if (name == NULL)
> >  		goto out_put;
> >  
> > -	fd = file_create(name, mode & 0777);
> > +	currentuid = from_kuid(current->cred->user_ns, current->cred->euid);
> > +	currentgid = from_kgid(current->cred->user_ns, current->cred->egid);
> > +	fd = file_create(name, mode & 0777, currentuid, currentgid);
> >  	if (fd < 0)
> >  		error = fd;
> >  	else
>
> That probably somehow interacts with the idmapping, +Glenn & Christian.
>
> Not saying it's wrong, but I don't understand it.
If I understood documentation correctly, filesystems should store userspace
ids and kernel performs translation of userspace id to kernel id when
file is read from the disk.
On the other hand, the variables euid and egid in struct cred are of
type kuid_t and kgid_t so this here performs reverse translation
(from kernel id to userspace id) based on the process' current user
namespace. Then the resulting userspace ids are sent to file_create()
to be stored on the disk.
If I have made some misunderstanding and this code has any error,
I apologise and please inform me so that I can fix it.
>
> > @@ -677,10 +688,14 @@ static int hostfs_mkdir(struct mnt_idmap *idmap, struct inode *ino,
> >  {
> >  	char *file;
> >  	int err;
> > +	unsigned int currentuid;
> > +	unsigned int currentgid;
> >  
> >  	if ((file = dentry_name(dentry)) == NULL)
> >  		return -ENOMEM;
> > -	err = do_mkdir(file, mode);
> > +	currentuid = from_kuid(current->cred->user_ns, current->cred->euid);
> > +	currentgid = from_kgid(current->cred->user_ns, current->cred->egid);
> > +	err = do_mkdir(file, mode, currentuid, currentgid);
>
> Here too.
>
> > +static int uml_chown(const char *pathname, unsigned int owner, unsigned int group)
> > +{
> > +	int status;
>
> Maybe just call that the more commonly used 'ret'? It does seem to
> follow kernel conventions (negative error codes, 0 for success).
Ok. Will be changed.
>
> > +
> > +	if (use_xattr) {
> > +		if (owner != -1) {
> > +			status = setxattr(pathname, "user.umluid", &owner,
> > +							sizeof(unsigned int), 0);
>
> Indentation _seems_ to be off a bit here, but could be my mail client.
I just checked, it is 82 columns wide. Sorry about that, will fix it.
>
> > +			if (status < 0)
> > +				return status;
> > +		}
> > +		if (group != -1) {
> > +			status = setxattr(pathname, "user.umlgid", &group,
> > +							sizeof(unsigned int), 0);
> > +			if (status < 0)
> > +				return status;
> > +		}
> > +		return 0;
>
> The logic here means you can partially succeed with chown even when this
> fails, maybe that's not great?
>
> Also, storing the value of 'owner' and 'group' as binary in host-endian
> format makes this needlessly unportable, though we don't really have UML
> for anything but x86 at this point ...
>
> Maybe for both it'd be better to do something like a 'user.umlperms'
> attribute that contains '<mode>,<uid>,<gid>' as a string? That also
> makes it easier to understand if you end up looking at it from the
> outside.
>
> Additionally - and again that's not relevant now I _think_ - we'd have
> to filter this attribute if hostfs ever supports xattrs inside, no?
> Assuming it would just push them into the same xattr outside, which
> seems reasonable though. Or maybe in that case some name translation
> should happen? I don't think anyone wants to run nested UML with hostfs
> though ...
>
Thank you for your suggestion! I will change format of stored
permissions to one string stored in one extended attribute.

Regarding nested UMLs using hostfs, a simple append of some char
(or string?) to the attribute name when it starts with user.umlperms
should solve the issue. Then first UML would access user.umlperms, UML
inside it would access user.umlperms1, UML inside that UML would access
user.umlperms11, etc. as seen by the host.
>
> > +	} else {
> > +		return chown(pathname, owner, group);
>
> You returned above, so don't really need the "else" here.
>
>
> > +// Remove double slash from the path
>
> Please don't use // comments.
>
> > +static void fix_path(char *path)
> > +{
> > +	int i = 0;
> > +	int skip = 0;
> > +
> > +	while (path[i] != '\0') {
> > +		if (path[i] == '/' && path[i+1] == '/')
> > +			skip = 1;
> > +		path[i] = path[i+skip];
> > +		i++;
> > +	}
> > +}
>
> Why is this needed? I don't think the host should have an issue with
> doubled slashes?
It appears that it really isn't needed. It was a leftover from earlier
attempt to fix some problem and then I assumed that maybe functions in
the rest of the code work using file descriptors so I left this just in
case.
Anyway, will remove it.
>
> > +// path is in the form "rootfs//var/abc"
> > +static long is_set_gid(const char *path)
> > +{
> > +	int i = strlen(path) + 1;
> > +	char *parent = mmap(NULL, i, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
>
> Why is this mmap()? We have uml_kmalloc() and I think that should work?
That was a temporary fix... I couldn't find identifier for function for
memory allocation in this part of the kernel. That will be changed,
thanks for information.
>
> > +	struct stat64 buf = { 0 };
> > +
> > +	strcpy(parent, path);
> > +	i = i - 3;
>
> why -3?
i is initially strlen(path) + 1, i.e. the length of the whole string
including the null byte. Then i-1 is the index of the null byte and i-2
is the index of the last character, but last charcter cannot be the
slash that separates file from it's parent directory, so search starts
from i-3.
>
> > +	while (parent[i] != '/')
> > +		i--;
> > +	parent[i] = '\0';
> > +	fix_path(parent);
> > +
> > +	stat64(parent, &buf);
>
> Is it even worth doing the stat on the host? if the S_ISGID is set
> there, it'll be done anyway by the host no? IOW, don't you only need to
> do the handling for 'internal' xattr permissions?
>
Yes, it will be done by the host but it will then be overridden by the
permissions in xattr so UML would present wrong GID to UML processes.
To fix that, UML does stat on the file to determine whether S_ISGID is
set there so that xattr permissions reflect that.

> >  	if (attrs->ia_valid & HOSTFS_ATTR_MODE) {
> >  		if (fd >= 0) {
> > -			if (fchmod(fd, attrs->ia_mode) != 0)
> > +			if (uml_fchmod(fd, attrs->ia_mode) != 0)
> >  				return -errno;
>
> The error handling here feels confusing now - it would seem to me that
> it's better to make it so that uml_fchmod() already returns the -errno
> in the appropriate places, and then use the returned value here?
>
> Otherwise there's just so much more code between the call and reading
> errno that it gets harder to follow, IMHO.
Ok. That will be fixed.
>
> johannes
Best regards,
Marko Petrovic




More information about the linux-um mailing list