[PATCH] ARM: fix unwinding for XIP kernels
Uwe Kleine-König
u.kleine-koenig at pengutronix.de
Sun Nov 20 17:52:07 EST 2011
On Sun, Nov 20, 2011 at 12:28:09PM +0100, Uwe Kleine-König wrote:
> On Thu, Nov 17, 2011 at 02:17:06PM +0000, Catalin Marinas wrote:
> > On Thu, Nov 17, 2011 at 01:40:00PM +0000, Uwe Kleine-König wrote:
> > > The linker places the unwind tables in readonly sections. So when using
> > > an XIP kernel these are located in ROM and cannot be modified.
> > >
> > > For that reason don't convert the symbol addresses during boot (or
> > > module loading) but only when interpreting them in search_index().
> > > Moreover, add several consts to catch future writes and rename the
> > > member "addr" of struct unwind_idx to "addr_offset" to better match the
> > > new semantics.
> > >
> > > This fixes unwinding on XIP which compared prel31 offsets to absolute
> > > addresses because the initial conversion from prel31 to absolute failed.
> >
> > My only worry - does this increase the index search time by doing the prel31
> > conversion every time? It could affect tools like lockdep that need to
> > get the backtrace regularly at run-time.
> I did a first test now using
>
> static int __init unwind_test(void)
> {
> 	unsigned long flags;
> 	u64 start, end;
> 	register unsigned long current_sp asm ("sp");
> 	int i;
>
> 	struct stackframe init_frame;
>
> 	init_frame.fp = (unsigned long)__builtin_frame_address(0);
> 	init_frame.sp = current_sp;
> 	init_frame.lr = (unsigned long)__builtin_return_address(0);
> 	init_frame.pc = (unsigned long)unwind_test;
>
> 	local_irq_save(flags);
> 	start = timestamp();
> 	for (i = 0; i < 100; ++i) {
> 		struct stackframe frame = init_frame;
> 		while (!unwind_frame(&frame));
> 	}
> 	end = timestamp();
> 	local_irq_restore(flags);
>
> 	pr_info("%s: ************************ unwind test took %llu\n",
> 		__func__, (unsigned long long)(end - start));
> 	return 0;
> }
> late_initcall(unwind_test);
>
> where timestamp reads and returns the value of a cpu counter on an mx35
> machine.
>
> The increase in runtime with my patch is approx. 7% for the above test
> case.
>
> I will try later to optimise a bit more as I wrote earlier in this
> thread.
OK, did that now, and it's a tad faster than the original implementation
with an optimisation[1] using my test case. Don't know why though. Maybe
because I moved the test for the first entry to after the loop and the
cache helps me there?
I'll send a patch in reply to this mail that applies on top of my
previous patch. It would be great if someone could proofread it.
Best regards
Uwe
[1] I optimised as follows:
--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -110,14 +110,13 @@ static struct unwind_idx *search_index(unsigned long addr,
 	if (addr < first->addr) {
 		pr_warning("unwind: Unknown symbol address %08lx\n", addr);
 		return NULL;
-	} else if (addr >= last->addr)
-		return last;
+	}
-	while (first < last - 1) {
+	while (first < last) {
 		struct unwind_idx *mid = first + ((last - first + 1) >> 1);
 		if (addr < mid->addr)
-			last = mid;
+			last = mid - 1;
 		else
 			first = mid;
 	}
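For readers following along, the idea of converting the prel31 offsets
only at lookup time, combined with the loop shape from [1], can be
sketched in plain userspace C. This is only an illustration under
assumptions, not the code from the patch: struct idx_entry,
prel31_decode(), prel31_encode(), entry_addr() and search() are
hypothetical names, and the table's load address is simulated with a
32-bit integer instead of real pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct unwind_idx. */
struct idx_entry {
	uint32_t addr_offset;	/* prel31: 31-bit signed offset relative to this field */
	uint32_t insn;		/* unused in this sketch */
};

/* Decode a prel31 value by sign-extending bit 30 (bit 31 is ignored). */
static int32_t prel31_decode(uint32_t v)
{
	v &= 0x7fffffffu;
	return (int32_t)(v ^ 0x40000000u) - 0x40000000;
}

/* Encode a signed offset as prel31; only needed to build test tables. */
static uint32_t prel31_encode(int32_t off)
{
	assert(off >= -0x40000000 && off <= 0x3fffffff);
	return (uint32_t)off & 0x7fffffffu;
}

/*
 * Absolute address described by entry i, assuming the table is located at
 * the simulated address "base". The table itself is never written to.
 */
static uint32_t entry_addr(const struct idx_entry *tab, uint32_t base, size_t i)
{
	uint32_t where = base + (uint32_t)(i * sizeof(*tab));

	return where + (uint32_t)prel31_decode(tab[i].addr_offset);
}

/*
 * Return the last entry whose address is <= addr, or NULL if addr lies
 * below the first entry. The check for the first entry comes after the loop.
 */
static const struct idx_entry *search(const struct idx_entry *tab, size_t n,
				      uint32_t base, uint32_t addr)
{
	size_t first, last;

	if (n == 0)
		return NULL;

	first = 0;
	last = n - 1;
	while (first < last) {
		/* round up so that "first = mid" always makes progress */
		size_t mid = first + ((last - first + 1) >> 1);

		if (addr < entry_addr(tab, base, mid))
			last = mid - 1;
		else
			first = mid;
	}

	if (addr < entry_addr(tab, base, first))
		return NULL;
	return &tab[first];
}
```

In this sketch first only moves to an entry whose address is <= addr and
last only moves below an entry whose address is > addr, so when the loop
ends the remaining entry is the only candidate and a single comparison
after the loop covers the "address below the first entry" case.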
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |