Question about ARM function graph tracing

Steven Rostedt rostedt at goodmis.org
Tue Feb 15 21:05:45 EST 2011


On Tue, 2011-02-15 at 17:39 -0800, Tim Bird wrote:
> On 02/04/2011 12:29 PM, Tim Bird wrote:
> >> OMAP is missing a notrace annotation on omap_readl():
> >>
> >> http://www.mail-archive.com/linux-omap@vger.kernel.org/msg38911.html
> >>
> >> Part of that patch was merged through rmk's work, but the
> >> omap_readl() annotation seems to have been missed.
> > Yep.  It's missing in 2.6.38-rc3.
> > 
> >> Also, if this is OMAP1 and not latest mainline, I think you
> >> will need  a notrace on omap_rev() also (called from inside
> >> omap_readl() until recently).
> > 
> > I couldn't find that in the sched_clock code paths, but
> > I 'notrace'd it anyways.
> > 
> >>
> >> I think this is most probably what is wrong, since IIRC I saw
> >> crashes like this on BeagleBoard before I added the notrace on
> >> omap_readl.  Could you please try with these changes?
> > 
> > I have sprinkled 'notrace's liberally throughout the sched_clock
> > code (including omap_readl() and omap_rev()), and I'm still
> > seeing problems.  I put a recursion guard in
> > prepare_ftrace_return, and I'm seeing lots of recursion.  So
> > there's still a notrace missing somewhere.

Note, if you find that you need to add "notrace" to most functions in a
file, you can simply make the entire file notrace from its Makefile:

CFLAGS_REMOVE_myfile.o = -pg
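
For example, if omap_readl() lived in arch/arm/plat-omap/io.c (the file
name here is just for illustration), the Makefile in that directory
would get:

	# Build io.o without the -pg profiling calls, so nothing in
	# this file hits mcount and the tracer never sees it.
	CFLAGS_REMOVE_io.o = -pg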


> 
> OK.  The notrace was indeed missing from omap_readl(), but fixing
> this did not completely fix the problem.  I put the following
> (crude) recursion guard in arch/arm/kernel/ftrace.c:prepare_ftrace_return:
> 
> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index c0062ad..5872c25 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -209,6 +209,8 @@ int __init ftrace_dyn_arch_init(void *data)
>  #endif /* CONFIG_DYNAMIC_FTRACE */
> 
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +static int already_here = 0;
> +
>  void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
>                            unsigned long frame_pointer)
>  {
> @@ -217,8 +219,16 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
>         unsigned long old;
>         int err;
> 
> -       if (unlikely(atomic_read(&current->tracing_graph_pause)))
> +       if(unlikely(already_here)) {
>                 return;
> +       } else {
> +               already_here=1;
> +       }

Um, single cpu?
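
That static int is fine for a quick hack on UP, but on SMP one CPU
setting the flag makes every other CPU silently skip the return hook,
and the check-and-set isn't atomic anyway. If you want the guard to
survive SMP, something per-CPU along these lines would be one way to do
it. Just a sketch, the flag name is made up:

	#include <linux/percpu.h>	/* DEFINE_PER_CPU, __this_cpu_* */

	static DEFINE_PER_CPU(int, in_graph_return);

	void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
				   unsigned long frame_pointer)
	{
		/*
		 * Stay on the CPU that owns the flag; the _notrace
		 * variants keep this from recursing into the tracer.
		 */
		preempt_disable_notrace();

		/* Nested entry on this CPU (e.g. from an IRQ): bail out */
		if (__this_cpu_read(in_graph_return))
			goto out;
		__this_cpu_write(in_graph_return, 1);

		/* ... existing body of prepare_ftrace_return ... */

		__this_cpu_write(in_graph_return, 0);
	out:
		preempt_enable_notrace();
	}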


> +
> +       if (unlikely(atomic_read(&current->tracing_graph_pause))) {
> +               already_here=0;
> +               return;
> +       }
> 
>         old = *parent;
>         *parent = return_hooker;
> @@ -227,6 +237,7 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
>                                        frame_pointer);
>         if (err == -EBUSY) {
>                 *parent = old;
> +               already_here=0;
>                 return;
>         }
> 
> @@ -237,6 +248,7 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
>                 current->curr_ret_stack--;
>                 *parent = old;
>         }
> +       already_here=0;
>  }
> 
>  #ifdef CONFIG_DYNAMIC_FTRACE
> 
> Without this, the system hangs for long periods of time after starting
> the function graph tracer (echo function_graph >/debug/tracing/current_tracer)
> Sometimes the system comes back and sometimes it doesn't.  With the above
> code, the system works, and I can get traces (but obviously I'm dropping
> a bunch of tracepoints whenever the tracer nests.)
> 
> I put some extra instrumentation in, and MOST of the time, the item
> causing nesting is asm_do_IRQ(), which is to be expected.  However,
> I have seen a recursion from preempt_schedule(), which I didn't really
> expect.  Does the tracer invoke anything that would cause a schedule
> from inside the trace-recording code?
> 
> Should I try this with CONFIG_PREEMPT off? (It's currently on.)
> 
> Any ideas for debugging this would be appreciated.

I've had to debug things that broke in the function graph tracer like
this before. Is dynamic ftrace available on ARM?

Hmm, from arch/arm/Kconfig:

	select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL)


I guess so. As long as you are not building an XIP kernel, I would
suggest using the dynamic function filter to binary-search for the
functions causing the issue.

cat available_filter_functions > /tmp/funcs

Then copy half the funcs to /tmp/funcs1 and the other half
to /tmp/funcs2
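
One way to do the split, just as a sketch:

	n=$(wc -l < /tmp/funcs)
	head -n $((n / 2)) /tmp/funcs > /tmp/funcs1
	tail -n +$((n / 2 + 1)) /tmp/funcs > /tmp/funcs2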

cat /tmp/funcs1 > set_ftrace_filter

This may take a while, as setting the filter one by one is slow.

Then run the function graph tracer. If it locks up, try again
with /tmp/funcs2. If both halves lock up, you may have two issues, or it
could be something wrong with the function graph tracer itself. But I'm
betting one will work and one won't.
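
Each round then looks something like this (paths assume debugfs is
mounted at /debug, as in your echo above):

	cd /debug/tracing
	cat /tmp/funcs1 > set_ftrace_filter
	echo function_graph > current_tracer
	# exercise the box, see if it hangs
	echo nop > current_tracer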

Then keep chopping the funcs files in half until you get down to the
set of functions causing the issue. That should point out the problem.

This is the procedure I used to find out why a KVM guest was locking up
with the function graph tracer.

See commit:

258af47479980d8238a04568b94a4e55aa1cb537

Something tells me you may have a similar issue ;)

-- Steve




