[PATCH v6 3/6] arm64: ftrace: fix a stack tracer's output under function graph tracer
Jungseok Lee
jungseoklee85 at gmail.com
Thu Nov 26 06:05:02 PST 2015
On Nov 26, 2015, at 12:05 PM, AKASHI Takahiro wrote:
> Jungseok,
Hi Akashi,
> On 11/25/2015 08:48 PM, Jungseok Lee wrote:
>> On Nov 25, 2015, at 2:29 PM, AKASHI Takahiro wrote:
>>> On 11/24/2015 10:37 PM, Jungseok Lee wrote:
>>>> On Nov 18, 2015, at 3:43 PM, AKASHI Takahiro wrote:
>>>>
>>>> Hi Akashi,
>>>>
>>>>> Function graph tracer modifies a return address (LR) in a stack frame
>>>>> to hook a function return. This will result in many useless entries
>>>>> (return_to_handler) showing up in
>>>>> a) a stack tracer's output
>>>>> b) perf call graph (with perf record -g)
>>>>> c) dump_backtrace (at panic et al.)
>>>>>
>>>>> For example, in case of a),
>>>>> $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>>>> $ echo 1 > /proc/sys/kernel/stack_trace_enabled
>>>>> $ cat /sys/kernel/debug/tracing/stack_trace
>>>>> Depth Size Location (54 entries)
>>>>> ----- ---- --------
>>>>> 0) 4504 16 gic_raise_softirq+0x28/0x150
>>>>> 1) 4488 80 smp_cross_call+0x38/0xb8
>>>>> 2) 4408 48 return_to_handler+0x0/0x40
>>>>> 3) 4360 32 return_to_handler+0x0/0x40
>>>>> ...
>>>>>
>>>>> In case of b),
>>>>> $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>>>> $ perf record -e mem:XXX:x -ag -- sleep 10
>>>>> $ perf report
>>>>> ...
>>>>> | | |--0.22%-- 0x550f8
>>>>> | | | 0x10888
>>>>> | | | el0_svc_naked
>>>>> | | | sys_openat
>>>>> | | | return_to_handler
>>>>> | | | return_to_handler
>>>>> ...
>>>>>
>>>>> In case of c),
>>>>> $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>>>> $ echo c > /proc/sysrq-trigger
>>>>> ...
>>>>> Call trace:
>>>>> [<ffffffc00044d3ac>] sysrq_handle_crash+0x24/0x30
>>>>> [<ffffffc000092250>] return_to_handler+0x0/0x40
>>>>> [<ffffffc000092250>] return_to_handler+0x0/0x40
>>>>> ...
>>>>>
>>>>> This patch replaces such entries with real addresses preserved in
>>>>> current->ret_stack[] at unwind_frame(). This way, we can cover all
>>>>> the cases.
>>>>
>>>> I've observed a strange behavior when playing with case c). Call trace
>>>> is as follows when function_graph is not used.
>>>>
>>>> Call trace:
>>>> [<fffffe00003dc738>] sysrq_handle_crash+0x24/0x30 <- (1)
>>>> [<fffffe00003dd2ac>] __handle_sysrq+0x128/0x19c <- (2)
>>>> [<fffffe00003dd730>] write_sysrq_trigger+0x60/0x74
>>>> [<fffffe0000249fc4>] proc_reg_write+0x84/0xc0
>>>> [<fffffe00001f2638>] __vfs_write+0x44/0x104 <- (3)
>>>> [<fffffe00001f2e60>] vfs_write+0x98/0x1a8
>>>> [<fffffe00001f3730>] SyS_write+0x50/0xb0
>>>> [<fffffe00000939ec>] el0_svc_naked+0x20/0x28 <- (4)
>>>>
>>>> When function_graph is set, some entries, such as do_mem_abort, are added
>>>> between (1) and (2). In addition, entries from (3) to (4) are not printed
>>>> out. As tracking down elements of ret_stack[], I realize dump_backtrace()
>>>> has been terminated before reaching to ret_stack[0]. Have you seen this
>>>> kind of behavior? I believe push & pop operations work correctly.
>>>>
>>>> Please correct me if I'm wrong.
>>>
>>> Oops, I mis-interpreted the result output.
>>> You are right.
>>> This can happen because the original dump_backtrace() wants to trace a stack
>>> from a function where an exception has taken place, sysrq_handle_crash(), while
>>> ret_stack[curr_ret_stack] doesn't point to that function, but the very top of
>>> traced functions in callchains, that is, __do_kernel_fault in your case, probably.
>>
>> Yes, __do_kernel_fault.
>>
>>> So it results in replacing entries of return_to_handler to wrong function addresses.
>>>
>>> A fixup! patch attached below (informative only) fixes this issue by once tracing
>>> all the functions on a stack, but preventing a top few ones from being printed.
>>> But there is a tricky thing here: we have to use 'regs->pc' instead of frame.pc
>>> as a trapped function because, as I've already mentioned before, we always miss
>>> the function when walking through a stack from an exception handler to functions
>>> in a thread context.
>>> (Please note that we will introduce a dummy stack frame at interrupt, but not
>>> at exception.)
>>
>> Thanks for clear explanation!
>>
>> Since I'm not the only person to experience the above case, I report one more and
>> the last observation ;)
>>
>> PC and LR information is printed out when function_graph is set.
>>
>> PC is at sysrq_handle_crash+0x24/0x30
>> LR is at sysrq_handle_crash+0x10/0x30
>>
>> The logs are as follows when function_graph turns off.
>>
>> PC is at sysrq_handle_crash+0x24/0x30
>> LR is at __handle_sysrq+0x128/0x19c
>>
>> I think __show_regs() might have a similar problem when retrieving LR according to
>> the below informative patch. Thoughts?
>
> Well, I believe that it is normal.
> Sysrq_handle_crash() is basically a leaf function, but once function or function graph
> tracer is turned on, ftrace_caller() is called before its function body and so LR has
> been modified when panic/show_regs().
Got it. All aspects of this patch is clear now. Thanks!
Best Regards
Jungseok Lee
More information about the linux-arm-kernel
mailing list