[PATCH v2 2/2] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions

Måns Rullgård mans at mansr.com
Wed Nov 25 16:50:08 PST 2015

Nicolas Pitre <nico at fluxnic.net> writes:

> On Thu, 26 Nov 2015, Måns Rullgård wrote:
>> Nicolas Pitre <nico at fluxnic.net> writes:
>> > 3) In fact I was wondering if the overhead of the branch and back is 
>> >    really significant compared to the non-trivial cost of an idiv 
>> >    instruction and all the complex infrastructure required to patch 
>> >    those branches directly, and consequently if the performance 
>> >    difference is actually worth it versus simply doing (2) alone.
>> Depending on the operands, the div instruction can take as few as 3
>> cycles on a Cortex-A7.
> Even the current software-based implementation can produce a result with 
> about 5 simple ALU instructions depending on the operands.
> The average cycle count is more important than the easy-way-out case. 
> And then there is the question of how significant the two branches 
> around it are compared to idiv alone from direct patching of every 
> call to it.

If not calling the function saves an I-cache miss, the benefit can be
substantial.  No, I have no proof of this being a problem, but it's
something that could happen.

Of course, none of this is going to be as good as letting the compiler
generate div instructions directly.
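
To illustrate that last point, here is a minimal sketch (not part of the
patch; the function names div_u and div_s are made up for this example).
With a GCC target that includes the hardware divide instructions, e.g.
-mcpu=cortex-a7 or -march=armv7ve, the divisions below should compile to
single udiv/sdiv instructions.  With plain -march=armv7-a, GCC instead
emits calls to __aeabi_uidiv/__aeabi_idiv, which are the calls this series
replaces.

```c
/* Hypothetical helpers, for illustration only. */

/* Unsigned division: udiv with hardware divide, else __aeabi_uidiv. */
unsigned int div_u(unsigned int n, unsigned int d)
{
	return n / d;
}

/* Signed division: sdiv with hardware divide, else __aeabi_idiv. */
int div_s(int n, int d)
{
	return n / d;
}
```

Comparing the -O2 -S output for the two targets should show the
difference; no branch, and no extra I-cache line, is involved in the
first case.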

Måns Rullgård
mans at mansr.com

More information about the linux-arm-kernel mailing list