[PATCH] ARM: Runtime patch udiv/sdiv instructions into __aeabi_{u}idiv()
Stephen Boyd
sboyd at codeaurora.org
Thu Jan 7 18:44:29 PST 2016
On 01/04, Nicolas Pitre wrote:
> On Mon, 4 Jan 2016, Stephen Boyd wrote:
> >
> > I can update the patches to be based on this patch here and
> > handle the conditional branches and tail call optimization cases
> > by adding some safety checks like we have for the ftrace branch
> > patching. But I'd rather not do that work unless we all agree
> > that it's worthwhile pursuing it.
> >
> > Is there still any concern about the benefit of patching each
> > call site vs. patching the functions? The micro benchmark seems
> > to show some theoretical improvement on cortex-a7 and I can run
> > it on Scorpion and Krait processors to look for any potential
> > benefits there, but I'm not sure of any good kernel benchmark for
> > this. If it will be rejected due to complexity vs. benefit
> > arguments I'd rather work on something else.
>
> You could run the benchmark on Scorpion and Krait to start with. If
> there is no improvement whatsoever, as on the A15, then the answer might
> be rather simple.
>
So running the benchmark on Scorpion is not useful because we
don't have the idiv instruction there. On Krait I get the
following results. I ran this on a dragonboard apq8074 with
maxcpus=1 on the kernel command line.

Testing INLINE_DIV ...
real 0m 13.56s
user 0m 13.56s
sys 0m 0.00s

Testing PATCHED_DIV ...
real 0m 15.15s
user 0m 15.14s
sys 0m 0.00s

Testing OUTOFLINE_DIV ...
real 0m 18.09s
user 0m 18.09s
sys 0m 0.00s

Testing LIBGCC_DIV ...
real 0m 24.26s
user 0m 24.25s
sys 0m 0.00s

It looks like the branch actually costs us some time here.
Patching isn't as good as the compiler inserting the instruction
itself, but it is better than branching to the division routine.
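
To put rough numbers on that, the relative cost of each variant can be worked out from the "real" times above. This is only a sketch: the `timings` table and the `slowdown_vs` helper are names made up for illustration, and the values are copied from the single Krait run shown here, so treat the percentages as indicative rather than as a careful measurement.

```python
# Wall-clock ("real") times in seconds, copied from the Krait results above.
# The dict name and helper below are illustrative, not part of the benchmark.
timings = {
    "INLINE_DIV": 13.56,
    "PATCHED_DIV": 15.15,
    "OUTOFLINE_DIV": 18.09,
    "LIBGCC_DIV": 24.26,
}


def slowdown_vs(baseline, variant):
    """Return how much longer `variant` took than `baseline`, in percent."""
    return (timings[variant] / timings[baseline] - 1.0) * 100.0


if __name__ == "__main__":
    for name, seconds in timings.items():
        extra = slowdown_vs("INLINE_DIV", name)
        print(f"{name:14s} {seconds:6.2f}s  {extra:+6.1f}% vs INLINE_DIV")
```

On these numbers PATCHED_DIV takes roughly 12% longer than INLINE_DIV, OUTOFLINE_DIV roughly 33% longer, and LIBGCC_DIV roughly 79% longer.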
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project