[PATCH v7 2/7] um: use execveat to create userspace MMs
Benjamin Berg
benjamin at sipsolutions.net
Thu Jul 4 10:39:37 PDT 2024
On Thu, 2024-07-04 at 18:49 +0200, Johannes Berg wrote:
> On Thu, 2024-07-04 at 18:27 +0200, Benjamin Berg wrote:
> >
> > + /* set a nice name */
> > + stub_syscall2(__NR_prctl, PR_SET_NAME, (unsigned long)"uml-userspace");
>
> Is that even needed when you're passing it as argv[0] in execve()? But
> whatever, it's fine, just wondering.
It is needed. I added it because argv[0] was not being used for the
process name, and I ended up with a number as the process name instead.
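
To illustrate what the PR_SET_NAME call does, here is a minimal
userspace sketch. It is not the stub code: the stub has no libc and
goes through stub_syscall2(), whereas this uses the libc prctl()
wrapper. The helper name set_and_get_comm() is made up for this
example. It sets the calling thread's name and reads it back with
PR_GET_NAME; the kernel keeps at most 15 bytes plus the terminating
NUL, so the sketch rejects longer names instead of letting them be
truncated silently.

```c
#include <assert.h>
#include <string.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper (not part of the patch): set the calling
 * thread's name and read it back into 'out', which must hold at
 * least 16 bytes. Returns 0 on success, -1 on failure.
 */
int set_and_get_comm(const char *name, char out[16])
{
	assert(name != NULL && out != NULL);

	/* Names longer than 15 bytes would be truncated by the kernel. */
	if (strlen(name) > 15)
		return -1;

	if (prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0) < 0)
		return -1;

	/* Read the name back to confirm what the kernel stored. */
	if (prctl(PR_GET_NAME, (unsigned long)out, 0, 0, 0) < 0)
		return -1;

	return 0;
}
```

Calling set_and_get_comm("uml-userspace", buf) should leave the same
string in buf, and it is also what shows up in /proc/<pid>/comm.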
Benjamin
> > + /* setup signal stack inside stub data */
> > + stack.ss_flags = 0;
> > + stack.ss_size = STUB_DATA_PAGES * UM_KERN_PAGE_SIZE;
> > + stack.ss_sp = (void *)init_data.stub_start + UM_KERN_PAGE_SIZE;
> > + stub_syscall2(__NR_sigaltstack, (unsigned long)&stack, 0);
> > +
> > + /* register SIGSEGV handler (SA_RESTORER, the handler never returns) */
> > + sa.sa_flags = SA_ONSTACK | SA_NODEFER | SA_SIGINFO | 0x04000000;
> > + sa.sa_handler_ = (void *) init_data.segv_handler;
> > + sa.sa_restorer = NULL;
> > + sa.sa_mask = 0L; /* No need to mask anything */
>
> most of that init could be in the initializer, except the dynamic ones
> of course; the NULL/0 is also unnecessary I guess (though might want the
> sa_mask for the comment)
>
> > + struct stub_init_data init_data = {
> > + .stub_start = STUB_START,
> > + .segv_handler = STUB_CODE +
> > + (unsigned long) stub_segv_handler -
> > + (unsigned long) __syscall_stub_start,
> > + };
> > + struct iomem_region *iomem;
> > + int ret;
> > +
> > + init_data.stub_code_fd = phys_mapping(uml_to_phys(__syscall_stub_start),
> > + &offset);
> > + init_data.stub_code_offset = MMAP_OFFSET(offset);
> > +
> > + init_data.stub_data_fd = phys_mapping(uml_to_phys(stack), &offset);
> > + init_data.stub_data_offset = MMAP_OFFSET(offset);
>
> also could move more here into the initializer
>
> > +static int __init init_stub_exe_fd(void)
> > +{
> > + size_t len = 0;
>
> maybe that should be called 'written'?
>
> > + int res;
>
> and technically that should be ssize_t for the write() return value,
> not that it'll be big enough to matter
>
> > + while (len < stub_exe_end - stub_exe_start) {
> > + res = write(stub_exe_fd, stub_exe_start + len,
> > + stub_exe_end - stub_exe_start - len);
> > + if (res < 0) {
> > + if (errno == EINTR)
> > + continue;
> > +
> > + if (tmpfile)
> > + unlink(tmpfile);
> > + panic("%s: Failed write to memfd: %d", __func__, errno);
>
> nit: not always memfd now
>
> > + if (!tmpfile) {
> > + fcntl(stub_exe_fd, F_ADD_SEALS,
> > + F_SEAL_WRITE | F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_SEAL);
> > + } else {
> > + /* Only executable by us */
> > + if (fchmod(stub_exe_fd, 00500) < 0) {
>
> now it's also readable, so comment doesn't seem right? maybe just remove
> it?
>
> > + unlink(tmpfile);
> > + panic("Could not make stub binary executable: %d",
> > + errno);
>
> perhaps print -errno?
>
> > + }
> > +
> > + close(stub_exe_fd);
> > + stub_exe_fd = open(tmpfile, O_RDONLY | O_CLOEXEC | O_NOFOLLOW);
> > + if (stub_exe_fd < 0) {
> > + unlink(tmpfile);
> > + panic("Could not reopen stub binary: %d", errno);
>
> same here
>
> johannes
>
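
For reference, the review comments on the write loop (a counter named
'written', ssize_t for the write() return value, and reporting -errno)
could be combined roughly as below. This is only a sketch under those
assumptions, not the code from the patch: write_all() is a made-up
helper name, and it returns -errno to the caller instead of calling
panic() so that it can be tried in a plain userspace program.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): write all 'len' bytes
 * of 'buf' to 'fd', retrying on EINTR and on short writes.
 * Returns 0 on success or -errno on failure.
 */
int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t written = 0;

	assert(buf != NULL || len == 0);

	while (written < len) {
		/* write() returns ssize_t: a byte count, or -1 on error */
		ssize_t res = write(fd, p + written, len - written);

		if (res < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, just retry */
			return -errno;
		}

		written += (size_t)res;
	}

	return 0;
}
```

In init_stub_exe_fd() the caller would then unlink the temporary file
(if one was used) and panic with the returned value.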