6.13+ uml crash
Benjamin Berg
benjamin at sipsolutions.net
Tue Apr 7 03:21:01 PDT 2026
Hi,
On Tue, 2026-04-07 at 02:47 -0700, Maciej Żenczykowski wrote:
> On Tue, Apr 7, 2026 at 5:01 PM Benjamin Berg <benjamin at sipsolutions.net> wrote:
> > you get an ESRCH error, which means that either the UML userspace
> > process does not exist or maybe we do not have permission to trace for
> > some reason.
> >
> > Now, the difference is that before the patch UML would just clone to
> > create the userspace processes. After the patch, it will execve() into
> > a separate executable that exists only as a memfd.
> >
> > I am noticing now, that we are doing the PTRACE_TRACEME inside the new
> > executable instead of the usual method of doing it before execve. So
> > maybe that makes a difference with AppArmor.
> >
> > So, I think you can do two things:
> > 1. As a quick workaround, simply use "seccomp=on".
> > However, the sandboxing of the userspace processes is not quite
> > as secure if you do that (i.e. they currently can break out and
> > make host syscalls).
>
> I tried the breaking commit, and 6.13.latest but it doesn't appear to
> support seccomp=on
>
> I tried 6.18.latest but it fails to build for me with
>
> In file included from ./include/linux/unwind_user.h:6,
> from ./include/linux/unwind_deferred.h:6,
> from kernel/fork.c:108:
> ./arch/x86/include/asm/unwind_user.h: In function ‘unwind_user_word_size’:
> ./arch/x86/include/asm/unwind_user.h:23:17: error: ‘struct pt_regs’
> has no member named ‘flags’
> 23 | if (regs->flags & X86_VM_MASK)
> | ^~
> CC net/ipv6/addrconf.o
> ./arch/x86/include/asm/unwind_user.h:23:27: error: ‘X86_VM_MASK’
> undeclared (first use in this function)
> 23 | if (regs->flags & X86_VM_MASK)
> | ^~~~~~~~~~~
> ./arch/x86/include/asm/unwind_user.h:23:27: note: each undeclared
> identifier is reported only once for each function it appears in
> CC block/blk-settings.o
> CC net/core/net_namespace.o
> ./arch/x86/include/asm/unwind_user.h:26:14: error: implicit
> declaration of function ‘user_64bit_mode’
> [-Wimplicit-function-declaration]
> 26 | if (!user_64bit_mode(regs))
> | ^~~~~~~~~~~~~~~
>
> it's likely related to some combination of kconfig options needed for
> android net tests...
>
> 6.16.latest builds but still crashes (not sure if it supports
> seccomp=on), but the crash is different:
The trace below is from seccomp mode. In this case, it looks to me like
the stub process died on us during startup. For the ptrace case, I
suspect that the stub did execute but we could not trace it (i.e. it
probably managed to stop itself with SIGSTOP in real_init, see below).

I guess you'll need to debug it some more.

You could add some more logging, e.g. print the status code in
os_reap_child. That information is missing from the log and may tell
you what failed when starting the stub process.
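
To illustrate the kind of logging meant here, this is a minimal
standalone sketch of decoding a waitpid() status. It is plain userspace
C and not the actual os_reap_child code; the helper names
describe_wait_status and reap_and_describe are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Turn a waitpid() status word into a human-readable string. */
static void describe_wait_status(int status, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (WIFEXITED(status))
		snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
	else if (WIFSIGNALED(status))
		snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
	else if (WIFSTOPPED(status))
		snprintf(buf, len, "stopped by signal %d", WSTOPSIG(status));
	else
		snprintf(buf, len, "unknown status 0x%x", (unsigned int)status);
}

/*
 * Reap one specific child and describe how it ended.
 * Returns 0 on success, -1 if waitpid() did not return the child.
 */
static int reap_and_describe(pid_t pid, char *buf, size_t len)
{
	int status;

	if (waitpid(pid, &status, 0) != pid)
		return -1;

	describe_wait_status(status, buf, len);
	return 0;
}
```

An exit code versus a terminating signal is exactly the distinction the
current log does not show, so printing either one next to the "lost MM
child" message should narrow down where the stub failed.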
So, what you are looking at is:
 * start_userspace: clone() a new process
 * userspace_tramp: sets up the FDs
 * userspace_tramp: execve() the memfd
 * real_init in stub_exe.c continues and sets up the stub:
   - sets up the required memory maps and FDs
   - in seccomp mode, installs the seccomp filters, etc.
   - in ptrace mode, does PTRACE_TRACEME and sends SIGSTOP to itself
Note that there is not much you can do inside stub_exe.c, which is why
it uses different exit codes to report errors.
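
The ptrace-mode handshake and the exit-code convention can be sketched
roughly as below. This is a simplified standalone illustration, not the
UML code: it uses fork() instead of clone() plus execve() of a memfd,
and the enum values and the names fake_stub and start_fake_stub are
hypothetical (the real codes in stub_exe.c differ).

```c
#include <assert.h> /* only needed for the checks in the usage note */
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical exit codes, one per startup step. */
enum stub_step {
	STUB_OK = 0,           /* stub reached its SIGSTOP */
	STUB_FAIL_MAPS = 1,    /* setting up memory maps/FDs failed */
	STUB_FAIL_TRACE = 2,   /* PTRACE_TRACEME failed */
	STUB_FAIL_OTHER = 255, /* fork/waitpid error or unexpected status */
};

/* Child side: stands in for real_init in ptrace mode. Never returns. */
static void fake_stub(enum stub_step simulate_failure)
{
	/* The stub cannot log, so each failing step gets its own exit code. */
	if (simulate_failure == STUB_FAIL_MAPS)
		_exit(STUB_FAIL_MAPS);

	if (simulate_failure == STUB_FAIL_TRACE ||
	    ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
		_exit(STUB_FAIL_TRACE);

	/* Hand control to the parent by stopping ourselves. */
	raise(SIGSTOP);
	_exit(STUB_FAIL_OTHER); /* only reached if something resumes us */
}

/* Parent side: start the stub and report which step, if any, failed. */
static enum stub_step start_fake_stub(enum stub_step simulate_failure)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return STUB_FAIL_OTHER;
	if (pid == 0)
		fake_stub(simulate_failure);

	/* WUNTRACED makes waitpid() report the stop as well as an exit. */
	if (waitpid(pid, &status, WUNTRACED) != pid)
		return STUB_FAIL_OTHER;

	if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP) {
		/* Success; this sketch just cleans up the stopped child. */
		kill(pid, SIGKILL);
		waitpid(pid, &status, 0);
		return STUB_OK;
	}
	if (WIFEXITED(status))
		return (enum stub_step)WEXITSTATUS(status);

	return STUB_FAIL_OTHER;
}
```

For example, assert(start_fake_stub(STUB_FAIL_MAPS) == STUB_FAIL_MAPS)
holds because the child exits before it ever touches ptrace. With
STUB_OK as the argument the result depends on the host: STUB_OK if the
PTRACE_TRACEME call is permitted, STUB_FAIL_TRACE if something denies
it.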
In seccomp mode you should be able to run "strace -f" on the UML
process, which should show you what is happening to the stub.

Not sure I can help more than this, though.
Benjamin
> Run /sbin/net_test.sh as init process
> Unexpectedly lost MM child! Affected tasks will segfault.
> wait_stub_done_seccomp : failed to wait for stub, pid = -1, errno = 0
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> CPU: 0 UID: 0 PID: 1 Comm: net_test.sh Not tainted 6.16.12-g7e13344d2211 #4 NONE
> Stack:
> 80803b60 6063f23b 6051b353 00000000
> 00000001 6063f23b 606314ef 60027dfc
> 80803b90 6002f679 60c70000 609374c0
> Call Trace:
> [<60027dfc>] ? _printk+0x0/0x57
> [<6003237a>] show_stack+0x10b/0x11a
> [<6051b353>] ? dump_stack_print_info+0x0/0x12b
> [<60027dfc>] ? _printk+0x0/0x57
> [<6002f679>] dump_stack_lvl+0x5e/0x79
> [<6002f6ae>] dump_stack+0x1a/0x1c
> [<60026626>] panic+0x156/0x377
> [<6003ff78>] ? unblock_signals+0x25/0x28
> [<6003ffa1>] ? um_set_signals+0x26/0x3e
> [<600264d0>] ? panic+0x0/0x377
> [<60049fd8>] do_exit+0x205/0x9f6
> [<60162f26>] ? kmem_cache_free+0x118/0x12b
> [<6004a9ff>] sys_exit_group+0x0/0x16
> [<60055d9c>] get_signal+0x7dc/0x803
> [<600555c0>] ? get_signal+0x0/0x803
> [<6004349c>] ? setup_signal_stack_si+0x0/0x1f0
> [<60055166>] ? signal_setup_done+0x0/0xb1
> [<600320f5>] do_signal+0x54/0x1ce
> [<60539370>] ? _raw_spin_unlock_irqrestore+0x0/0x39
> [<60032db6>] fatal_sigsegv+0x32/0x3e
> [<60041ac6>] wait_stub_done_seccomp+0x2b1/0x2c0
> [<60041efa>] userspace+0xff/0x694
> [<60031769>] ? interrupt_end+0x0/0xac
> [<60179ba4>] ? copy_strings_kernel+0x0/0x9e
> [<6003166a>] new_thread_handler+0x5a/0x5e
> /aosp-tests/net/test/run_net_test.sh: line 503: 939215 Aborted
> (core dumped) $KERNEL_BINARY umid=net_test mem=512M
> seccomp=on $blockdevice=$ROOTFS $netconfig $consolemode ssl3=null,fd:3
> $cmdline 1>&2 3> "${SSL3}"
> Warning: UML exited with 134 instead of zero.
>
> > 2. Move the ptrace(PTRACE_TRACEME) call into userspace_tramp to
> > check if AppArmor is permitting ptrace then.
>
> haven't yet had time to try this approach.
Benjamin
>
> >
> > Benjamin
> >
> > On Tue, 2026-04-07 at 15:15 +0900, Maciej Żenczykowski wrote:
> > > On Tue, Apr 7, 2026 at 2:56 PM Berg, Johannes <johannes.berg at intel.com> wrote:
> > > > Hi,
> > > >
> > > > Haven't looked at this yet - but really better if you CC the UML list.
> > > >
> > > > johannes
> > > >
> > > > > -----Original Message-----
> > > > > From: Maciej Żenczykowski <maze at google.com>
> > > > > Sent: Saturday, April 4, 2026 5:25 AM
> > > > > To: Berg, Benjamin <benjamin.berg at intel.com>; Berg, Johannes
> > > > > <johannes.berg at intel.com>; Tiwei Bie <tiwei.btw at antgroup.com>
> > > > > Subject: Re: 6.13+ uml crash
> > > > >
> > > > > On Fri, Apr 3, 2026 at 12:34 PM Maciej Żenczykowski <maze at google.com>
> > > > > wrote:
> > > > > >
> > > > > > Host is:
> > > > > >
> > > > > > 6.18.14-1rodete1-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.18.14-1rodete1
> > > > > > (2026-03-06) x86_64 GNU/Linux
> > > > > >
> > > > > > /proc/cmdline
> > > > > > ima_hash=sha256 kfence.sample_interval=100 intel_iommu=sm_off
> > > > > > pci=noats printk.devkmsg=on slab_nomerge
> > > > > > lsm=landlock,lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo,bpf
> > > > > > apparmor=1 panic=30 glinux-boot-image=default-20260324.03.00
> > > > > > earlycon=uart8250,io,0x3f8 console=ttyS0,115200n8 console=hvc0
> > > > > > console=tty0 splash plymouth.ignore-serial-consoles i915.enable_psr=0
> > > > > >
> > > > > > No SELinux, but I think there is likely some corporate AppArmor policy...
> > > > > >
> > > > > > I seem to hit this problem while trying to use UML on latest
> > > > > > 6.13/6.14/6.15/6.16/6.17/6.18/6.19
> > > > > > 6.12 guests are the last ones that work.
> > > > > >
> > > > > > I noticed 6.13 had many UML changes...
> > > > > >
> > > > > > This is while trying to run the android net tests.
> > > > > >
> > > > > > https://android.googlesource.com/kernel/tests/+log/refs/heads/mirror-goog-main-kernel
> > > > > > /aosp-tests/net/test/run_net_test.sh --builder all_tests.sh
> > > > > >
> > > > > > ...
> > > > > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > > > > > devtmpfs: mounted
> > > > > > This architecture does not have kernel memory protection.
> > > > > > Run /sbin/net_test.sh as init process
> > > > > > Registers -
> > > > > > 0 0x60025a90
> > > > > > 1 0x2e2ea5
> > > > > > 2 0x600423e4
> > > > > > 3 0x2e2ea8
> > > > > > 4 0x7fffffffbe30
> > > > > > 5 0x0
> > > > > > 6 0x206
> > > > > > 7 0x0
> > > > > > 8 0x7fffffffbe30
> > > > > > 9 0xffffffff
> > > > > > 10 0x0
> > > > > > 11 0x7ffff7c40c87
> > > > > > 12 0x0
> > > > > > 13 0x13
> > > > > > 14 0x2e2ea8
> > > > > > 15 0x3e
> > > > > > 16 0x7fffffffc000
> > > > > > 17 0x33
> > > > > > 18 0x206
> > > > > > 19 0x7fffffffeff8
> > > > > > 20 0x2b
> > > > > > 21 0x7ffff7f9d740
> > > > > > 22 0x0
> > > > > > 23 0x0
> > > > > > 24 0x0
> > > > > > 25 0x0
> > > > > > 26 0x0
> > > > > > Kernel panic - not syncing: do_syscall_stub : PTRACE_SETREGS failed,
> > > > > > errno = 3 [ESRCH]
> > > > > > CPU: 0 UID: 0 PID: 1 Comm: net_test.sh Not tainted
> > > > > > 6.13.12-g2abfa5d47651 #3
> > > > > > Stack:
> > > > > > 600277e4 60629dd1 80803d80 00000000
> > > > > > 00000001 60629dd1 6061c0a9 600277e4
> > > > > > 80803db0 6002efd2 60c70000 609171c0
> > > > > > Call Trace:
> > > > > > [<600277e4>] ? _printk+0x0/0x57
> > > > > > [<600322ec>] show_stack+0x10b/0x11a
> > > > > > [<600277e4>] ? _printk+0x0/0x57
> > > > > > [<600277e4>] ? _printk+0x0/0x57
> > > > > > [<6002efd2>] dump_stack_lvl+0x5e/0x79
> > > > > > [<6002f007>] dump_stack+0x1a/0x1c
> > > > > > [<60026838>] panic+0x156/0x377
> > > > > > [<600277e4>] ? _printk+0x0/0x57
> > > > > > [<600266e2>] ? panic+0x0/0x377
> > > > > > [<600277e4>] ? _printk+0x0/0x57
> > > > > > [<6004555b>] do_syscall_stub+0xa6/0x132
> > > > > > [<60031608>] ? interrupt_end+0x0/0xae
> > > > > > [<6004560b>] syscall_stub_flush+0x24/0x2e
> > > > > > [<60045daf>] userspace+0x81/0x469
> > > > > > [<60031608>] ? interrupt_end+0x0/0xae
> > > > > > [<601761ba>] ? copy_strings_kernel+0x0/0x9e
> > > > > > [<60031509>] new_thread_handler+0x5a/0x5e
> > > > > > /aosp-tests/net/test/run_net_test.sh: line 498: 3026597 Aborted
> > > > > > (core dumped) $KERNEL_BINARY umid=net_test mem=512M
> > > > > > $blockdevice=$ROOTFS $netconfig $consolemode ssl3=null,fd:3 $cmdline
> > > > > > 1>&2 3> "${SSL3}"
> > > > > > Warning: UML exited with 134 instead of zero.
> > > > >
> > > > > It appears this broke at:
> > > > >
> > > > > commit 32e8eaf263d9be014ba1970444f745682fa9c6c0
> > > > > Author: Benjamin Berg <benjamin.berg at intel.com>
> > > > > um: use execveat to create userspace MMs
> > >
> > > --
> > > Maciej Żenczykowski, Kernel Networking Developer @ Google
>
> --
> Maciej Żenczykowski, Kernel Networking Developer @ Google