[PATCH] arm64: Add user stacktrace support

Mark Rutland mark.rutland at arm.com
Fri Nov 24 02:12:18 PST 2023


On Fri, Nov 24, 2023 at 05:55:27PM +0800, chenqiwu wrote:
> On Tue, Nov 21, 2023 at 02:05:29PM +0000, Mark Rutland wrote:
> > On Tue, Nov 21, 2023 at 05:03:00PM +0800, chenqiwu wrote:
> > > On Mon, Nov 20, 2023 at 02:29:17PM +0000, Mark Rutland wrote:
> > > > On Sat, Nov 18, 2023 at 09:45:04PM +0800, qiwuchen55 at gmail.com wrote:
> > > > > From: chenqiwu <qiwu.chen at transsion.com>
> > > > > 
> > > > > 1. Introduce and export arch_dump_user_stacktrace() API to support
> > > > > user stacktrace dump for a user task (both current and non-current task).
> > > > > A example test about the log format of user stacktrace as shown below:
> > > > > [test-515] Dump user backtrace:
> > > > > <0xffffb0c1a750> in /lib/aarch64-linux-gnu/libc-2.32.so[ffffb0b53000-ffffb0cb1000]
> > > > > <0xaaaacbf8097c> in /mnt/test[aaaacbf80000-aaaacbf81000]
> > > > > <0xffffb0b778b8> in /lib/aarch64-linux-gnu/libc-2.32.so[ffffb0b53000-ffffb0cb1000]
> > > > > <0xaaaacbf80834> in /mnt/test[aaaacbf80000-aaaacbf81000]
> > > > 
> > > > Where is this used?
> > > > 
> > > It's used in kernel space for some case need user backtrace for debugging,
> > > such as:  https://lkml.org/lkml/2023/11/9/1365
> > 
> > That's not in mainline.
> >
> I konw, you may misslead my intention for introducing the arch_dump_user_stacktrace API.
> For this case only dump panic kernel trace any user regs and backtrace info on global
> init exit. If arch_dump_user_stacktrace API can be merged into kernel-tree firstly, we
> can use the API directly to dump user backtrace on global init exit, which will be helpful
> to find the global init exit reason.

I don't know what you mean by "you may misslead my intention" here. This patch
didn't add code to call the backtrace on global init exit, nor did the commit
message say anything about that.

Regardless, that doesn't change the fact that there's no caller in mainline nor
any caller added in this patch. NAK for adding that without a user.

> BTW, I think there are two significant places for the usage of arch_dump_user_stacktrace:
> 1) we can dump any important user thread backtrace without log ratelimit in arm64_show_signal(),
> e.g: global_init thread
> +++ b/arch/arm64/kernel/traps.c
> @@ -247,12 +247,14 @@ static void arm64_show_signal(int signo, const char *str)
>         unsigned long esr = tsk->thread.fault_code;
>         struct pt_regs *regs = task_pt_regs(tsk);
> 
> +       if (unlikely(is_global_init(tsk)))
> +               goto dump;
>         /* Leave if the signal won't be shown */
>         if (!show_unhandled_signals ||
>             !unhandled_signal(tsk, signo) ||
>             !__ratelimit(&rs))
>                 return;
> -
> +dump:
>         pr_info("%s[%d]: unhandled exception: ", tsk->comm, task_pid_nr(tsk));
>         if (esr)
>                 pr_cont("%s, ESR 0x%016lx, ", esr_get_class_string(esr), esr);
> @@ -261,6 +263,7 @@ static void arm64_show_signal(int signo, const char *str)
>         print_vma_addr(KERN_CONT " in ", regs->pc);
>         pr_cont("\n");
>         __show_regs(regs);
> +       arch_dump_user_stacktrace(regs);
>  }

Huh? So for any signal directed to the global init task we'll print something
out, even if that signal will be handled?

That is *not* a good idea, and I don't think that's something we want generally
(though I appreciate it may help to debug some issues).

> 2) we can dump the user thread's backtrace which blocked in D state in hung_task,
> in this way, we can get more clues to find the user thread blocked reason.
> +++ b/kernel/hung_task.c
> @@ -137,6 +137,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>                         init_utsname()->version);
>                 pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
>                         " disables this message.\n");
> +               if (user_mode(regs))
> +                       arch_dump_user_stacktrace(task_pt_regs(t));
>                 sched_show_task(t);

Maybe, but I don't think that's actually necessary.

> > > > >  arch/arm64/Kconfig             |   1 +
> > > > >  arch/arm64/kernel/stacktrace.c | 208 +++++++++++++++++++++++++++++++++
> > > > >  include/linux/stacktrace.h     |  10 ++
> > > > >  3 files changed, 219 insertions(+)
> > > > 
> > > > As above, we already have user stacktracing code, and we shouldn't add
> > > > *distinct* unwinders. Either that code should be factored out and reused, or
> > > > this code should replace it.
> > > > 
> > > Currently, ARM64 platform is not supported for ftrace userstacktrace profile feature,
> > > since CONFIG_USER_STACKTRACE_SUPPORT is not enabled, the call chain cannot be accessed:
> > > ftrace_trace_userstack -> stack_trace_save_user -> arch_stack_walk_user
> > 
> > As above, please reuse the existing code, e.g. take the existing logic in
> > perf_callchain_user() and use it to implement arch_stack_walk_user(), and call
> > that from perf_callchain_user() an stack_trace_save_user().
> > 
> Okay, I will develop the stack_trace_save_user() refers to perf_callchain_user() and
> resend this patch as V2.

Thanks!

Mark.



More information about the linux-arm-kernel mailing list