ftrace performance impact with different configuration
Philippe Rétornaz
philippe.retornaz at epfl.ch
Fri Dec 30 08:07:11 EST 2011
On Thursday 29 December 2011 11:21:25 Steven Rostedt wrote:
> On Thu, 2011-12-29 at 21:12 +0530, Rabin Vincent wrote:
> > On Thu, Dec 29, 2011 at 14:08, Lei Wen <adrian.wenl at gmail.com> wrote:
> > > 2. Dynamic ftrace also seems to impose some penalty on the running
> > > system, even though it patches the running kernel with nop stubs...
> > >
> > > For the second item, has anyone done research on how to bring the
> > > cost for the running system down to zero while tracing is not
> > > enabled yet?
> >
> > One thing that needs to be fixed (for ARM) is that for the new-style
> > mcounts, the nop that's currently being done is not really a nop -- it
> > removes the function call, but there is still an unnecessary push/pop
> > sequence. This should be modified to have the push {lr} removed too.
> > (Two instructions replaced instead of one.)
>
> Unfortunately you can't do this, at least not when the kernel is
> preemptible.
>
> Say we have:
>
> push lr
> call mcount
>
> then we convert it to:
>
> nop
> nop
>
> The conversion to nops should not be an issue, and this is what would be
> done when the system boots up. But when we later enable tracing, some low
> priority task could have been preempted after executing the first nop.
> We then call stop machine to do the conversions (without stop machine,
> let's just say a higher prio task is running while we do the
> conversions) and add both the push lr and the call back. When that
> lower priority task gets scheduled in again, it will look as if
> it ran:
>
> nop
> call mcount
>
> Since the call to mcount requires that the lr was pushed, this process
> will crash when the return is done, because we never saved the lr.
>
> If you don't like the push, the best thing you can do is convert to:
>
> jmp 1f
> call mcount
> 1:
>
> This may not be as cheap as two nops, but it may be better than a push.

Sorry if this is a naive question, but why is it not possible to do it in
two steps?

First, call stop_machine() to put in the jmp, which skips the call to mcount.
Then wait until all tasks have hit schedule() (synchronize_sched()?).
Finally, replace both instructions with the two nops, since at that point we
know that nobody is about to call mcount.
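
To make the ordering concrete, here is a toy Python model of the patch site. It is only a sketch of my reasoning, not kernel code: the helper names (resume, apply_sequence, WAIT) are made up for illustration, WAIT stands in for "every task has gone through schedule()", and I assume that mcount pops the saved lr itself. It only follows a single task that was preempted between the two instructions.

```python
# Toy model of a two-instruction ftrace patch site. Not kernel code:
# the names below (resume, apply_sequence, WAIT) are hypothetical.
PUSH, CALL, NOP, JMP = "push lr", "call mcount", "nop", "jmp 1f"
WAIT = "wait"  # stands in for "every task has passed through schedule()"


def resume(site, pc, pushed):
    """Run `site` from index `pc`; return True if lr handling stays sane."""
    while pc < len(site):
        insn = site[pc]
        if insn == JMP:
            break                # branch to 1:, skipping the rest of the site
        if insn == PUSH:
            pushed += 1
        elif insn == CALL:
            if pushed == 0:
                return False     # mcount returns through an lr nobody saved
            pushed -= 1          # assumed: mcount pops the saved lr itself
        pc += 1
    return pushed == 0           # a leftover push means a corrupted stack


def apply_sequence(start, steps):
    """Preempt one task after the first instruction of `start`, apply
    `steps` (new site contents or WAIT), then let the task run again."""
    first = start[0]
    # A task that executed jmp has already left the site: nothing to track.
    state = None if first == JMP else (1, 1 if first == PUSH else 0)
    site = start
    for step in steps:
        if step == WAIT:
            if state is not None and not resume(site, *state):
                return False
            state = None         # the task has left the patch site
        else:
            site = step          # the site is rewritten under the task
    return state is None or resume(site, *state)


# The race described above: nops turned back into push/call in one go.
direct_enable = apply_sequence([NOP, NOP], [[PUSH, CALL]])
# The proposed order: jmp first, wait, then the two nops.
two_step = apply_sequence([PUSH, CALL], [[JMP, CALL], WAIT, [NOP, NOP]])
# Same patches without the wait in between.
no_wait = apply_sequence([PUSH, CALL], [[JMP, CALL], [NOP, NOP]])

print(direct_enable, two_step, no_wait)
```

In this model the direct nop to push/call conversion and the variant without the wait both fail, while jmp, wait, nops goes through. It says nothing about whether synchronize_sched() is actually the right primitive for the wait.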
Thanks,
Philippe