[PATCH v7 2/7] um: use execveat to create userspace MMs

Benjamin Berg benjamin at sipsolutions.net
Thu Jul 4 10:39:37 PDT 2024


On Thu, 2024-07-04 at 18:49 +0200, Johannes Berg wrote:
> On Thu, 2024-07-04 at 18:27 +0200, Benjamin Berg wrote:
> > 
> > +	/* set a nice name */
> > +	stub_syscall2(__NR_prctl, PR_SET_NAME, (unsigned long)"uml-userspace");
> 
> Is that even needed when you're passing it as argv[0] in execve()? But
> whatever, it's fine, just wondering.

It is needed. I added it because the argv[0] was not being used and I
ended up with a number as the process name.

Benjamin

> > +	/* setup signal stack inside stub data */
> > +	stack.ss_flags = 0;
> > +	stack.ss_size = STUB_DATA_PAGES * UM_KERN_PAGE_SIZE;
> > +	stack.ss_sp = (void *)init_data.stub_start + UM_KERN_PAGE_SIZE;
> > +	stub_syscall2(__NR_sigaltstack, (unsigned long)&stack, 0);
> > +
> > +	/* register SIGSEGV handler (SA_RESTORER, the handler never returns) */
> > +	sa.sa_flags = SA_ONSTACK | SA_NODEFER | SA_SIGINFO | 0x04000000;
> > +	sa.sa_handler_ = (void *) init_data.segv_handler;
> > +	sa.sa_restorer = NULL;
> > +	sa.sa_mask = 0L; /* No need to mask anything */
> 
> most of that init could be in the initializer, except the dynamic ones
> of course; the NULL/0 is also unnecessary I guess (though might want the
> sa_mask for the comment)
> 
> > +	struct stub_init_data init_data = {
> > +		.stub_start = STUB_START,
> > +		.segv_handler = STUB_CODE +
> > +				(unsigned long) stub_segv_handler -
> > +				(unsigned long) __syscall_stub_start,
> > +	};
> > +	struct iomem_region *iomem;
> > +	int ret;
> > +
> > +	init_data.stub_code_fd = phys_mapping(uml_to_phys(__syscall_stub_start),
> > +					      &offset);
> > +	init_data.stub_code_offset = MMAP_OFFSET(offset);
> > +
> > +	init_data.stub_data_fd = phys_mapping(uml_to_phys(stack), &offset);
> > +	init_data.stub_data_offset = MMAP_OFFSET(offset);
> 
> also could move more here into the initializer
> 
> > +static int __init init_stub_exe_fd(void)
> > +{
> > +	size_t len = 0;
> 
> maybe that should be called 'written'?
> 
> > +	int res;
> 
> and I technically that should be ssize_t for the write() return value,
> not that it'll be big enough to matter
> 
> > +	while (len < stub_exe_end - stub_exe_start) {
> > +		res = write(stub_exe_fd, stub_exe_start + len,
> > +			    stub_exe_end - stub_exe_start - len);
> > +		if (res < 0) {
> > +			if (errno == EINTR)
> > +				continue;
> > +
> > +			if (tmpfile)
> > +				unlink(tmpfile);
> > +			panic("%s: Failed write to memfd: %d", __func__, errno);
> 
> nit: not always memfd now
> 
> > +	if (!tmpfile) {
> > +		fcntl(stub_exe_fd, F_ADD_SEALS,
> > +		      F_SEAL_WRITE | F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_SEAL);
> > +	} else {
> > +		/* Only executable by us */
> > +		if (fchmod(stub_exe_fd, 00500) < 0) {
> 
> now it's also readable, so comment doesn't seem right? maybe just remove
> it?
> 
> > +			unlink(tmpfile);
> > +			panic("Could not make stub binary excutable: %d",
> > +			      errno);
> 
> perhaps print -errno?
> 
> > +		}
> > +
> > +		close(stub_exe_fd);
> > +		stub_exe_fd = open(tmpfile, O_RDONLY | O_CLOEXEC | O_NOFOLLOW);
> > +		if (stub_exe_fd < 0) {
> > +			unlink(tmpfile);
> > +			panic("Could not reopen stub binary: %d", errno);
> 
> same here
> 
> johannes
> 




More information about the linux-um mailing list