[RFC PATCH v2 10/13] x86/um: nommu: signal handling
Benjamin Berg
benjamin at sipsolutions.net
Thu Nov 28 02:37:21 PST 2024
Hi,
On Mon, 2024-11-11 at 15:27 +0900, Hajime Tazaki wrote:
> This commit updates the behavior of signal handling under !MMU
> environment. 1) the stack preparation for the signal handlers and
> 2) restoration of stack after rt_sigreturn(2) syscall. Those are needed
> as the stack usage on vfork(2) syscall is different.
>
> It also adds the follow up routine for SIGSEGV as a signal delivery runs
> in the same stack frame while we have to avoid endless SIGSEGV.
>
> Signed-off-by: Hajime Tazaki <thehajime at gmail.com>
> ---
> arch/um/include/shared/kern_util.h | 3 +++
> arch/um/kernel/trap.c | 10 ++++++++
> arch/um/os-Linux/signal.c | 18 ++++++++++++++-
> arch/x86/um/signal.c | 37 +++++++++++++++++++++++++++++-
> 4 files changed, 66 insertions(+), 2 deletions(-)
>
> [SNIP]
> diff --git a/arch/um/os-Linux/signal.c b/arch/um/os-Linux/signal.c
> index 52852018a3ad..a06622415d8f 100644
> --- a/arch/um/os-Linux/signal.c
> +++ b/arch/um/os-Linux/signal.c
> @@ -36,7 +36,15 @@ static void sig_handler_common(int sig, struct siginfo *si, mcontext_t *mc)
> struct uml_pt_regs r;
> int save_errno = errno;
>
> - r.is_user = 0;
> +#ifndef CONFIG_MMU
> + memset(&r, 0, sizeof(r));
> + /* mark is_user=1 when the IP is from userspace code. */
> + if (mc && (REGS_IP(mc->gregs) > uml_reserved
> + && REGS_IP(mc->gregs) < high_physmem))
> + r.is_user = 1;
> + else
> +#endif
> + r.is_user = 0;
Does this work if we load modules dynamically?
I suppose one could map them into a separate memory area rather than
running them directly from the physical memory.
Otherwise we'll also get problem with the SECCOMP filter.
> if (sig == SIGSEGV) {
> /* For segfaults, we want the data from the sigcontext. */
> get_regs_from_mc(&r, mc);
> @@ -191,6 +199,7 @@ static void hard_handler(int sig, siginfo_t *si, void *p)
> ucontext_t *uc = p;
> mcontext_t *mc = &uc->uc_mcontext;
> unsigned long pending = 1UL << sig;
> + int is_segv = 0;
>
> do {
> int nested, bail;
> @@ -214,6 +223,7 @@ static void hard_handler(int sig, siginfo_t *si, void *p)
>
> while ((sig = ffs(pending)) != 0){
> sig--;
> + is_segv = (sig == SIGSEGV) ? 1 : 0;
> pending &= ~(1 << sig);
> (*handlers[sig])(sig, (struct siginfo *)si, mc);
> }
> @@ -227,6 +237,12 @@ static void hard_handler(int sig, siginfo_t *si, void *p)
> if (!nested)
> pending = from_irq_stack(nested);
> } while (pending);
> +
> +#ifndef CONFIG_MMU
> + /* if there is SIGSEGV notified, let the userspace run w/ __noreturn */
> + if (is_segv)
> + sigsegv_post_routine();
> +#endif
> }
I am confused, this doesn't feel quite correct to me.
So, for normal UML, I think we always do an rt_sigreturn. Which means,
we always go back to the corresponding *kernel* task. To schedule in
response to SIGALRM, we forward the signal to the userspace process.
I believe that means:
1. We cannot schedule kernel threads (that seems like a bug)
2. Scheduling for userspace happens once the signal is delivered.
Then userspace() saves the state and calls interrupt_end().
Now, keep in mind that we are on the separate signal stack here. If we
jump anywhere directly, we abandon the old state information stored by
the host kernel into the mcontext. We can absolutely do that, but we
need to be careful to not forget anything.
As such, I wonder whether nommu should:
1. When entering from kernel, update "current->thread.switch_buf"
from the mcontext.
- If we need to schedule, push a stack frame that calls the scheduling
code and returns with the correct state.
2. When entering from user, store the task registers from the
mcontext. At some point (here or earlier) ensure that the
"current->thread.switch_buf" is set up so that we can return to
userspace by restoring the task registers.
- To schedule, piggy back on 1. or add special code.
3. Always do a UML_LONGJMP() back into the "current" task.
That said, I am probably not having the full picture right now.
Benjamin
PS: On a further note, I think the current code to enter userspace
cannot handle single stepping. I suppose that is fine, but you should
probably set arch_has_single_step to 0 for nommu.
> void set_handler(int sig)
> diff --git a/arch/x86/um/signal.c b/arch/x86/um/signal.c
> index 75087e85b6fd..b7365c75a967 100644
> --- a/arch/x86/um/signal.c
> +++ b/arch/x86/um/signal.c
> @@ -371,6 +371,13 @@ int setup_signal_stack_si(unsigned long stack_top, struct ksignal *ksig,
> round_down(stack_top - sizeof(struct rt_sigframe), 16);
>
> /* Add required space for math frame */
> +#ifndef CONFIG_MMU
> + /*
> + * the sig_frame on !MMU needs be aligned for SSE as
> + * the frame is used as-is.
> + */
> + math_size = round_down(math_size, 16);
> +#endif
> frame = (struct rt_sigframe __user *)((unsigned long)frame - math_size);
>
> /* Subtract 128 for a red zone and 8 for proper alignment */
> @@ -417,6 +424,18 @@ int setup_signal_stack_si(unsigned long stack_top, struct ksignal *ksig,
> /* could use a vstub here */
> return err;
>
> +#ifndef CONFIG_MMU
> + /*
> + * we need to push handler address at top of stack, as
> + * __kernel_vsyscall, called after this returns with ret with
> + * stack contents, thus push the handler here.
> + */
> + frame = (struct rt_sigframe __user *) ((unsigned long) frame -
> + sizeof(unsigned long));
> + err |= __put_user((unsigned long)ksig->ka.sa.sa_handler,
> + (unsigned long *)frame);
> +#endif
> +
> if (err)
> return err;
>
> @@ -442,9 +461,25 @@ SYSCALL_DEFINE0(rt_sigreturn)
> unsigned long sp = PT_REGS_SP(¤t->thread.regs);
> struct rt_sigframe __user *frame =
> (struct rt_sigframe __user *)(sp - sizeof(long));
> - struct ucontext __user *uc = &frame->uc;
> + struct ucontext __user *uc;
> sigset_t set;
>
> +#ifndef CONFIG_MMU
> + /**
> + * we enter here with:
> + *
> + * __restore_rt:
> + * mov $15, %rax
> + * call *%rax (translated from syscall)
> + *
> + * (code is from musl libc)
> + * so, stack needs to be popped of "call"ed address before
> + * looking at rt_sigframe.
> + */
> + frame = (struct rt_sigframe __user *)((unsigned long)frame + sizeof(long));
> +#endif
> + uc = &frame->uc;
> +
> if (copy_from_user(&set, &uc->uc_sigmask, sizeof(set)))
> goto segfault;
>
More information about the linux-um
mailing list