[PATCH v2 2/2] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions
Måns Rullgård
mans at mansr.com
Wed Nov 25 18:19:48 PST 2015
Russell King - ARM Linux <linux at arm.linux.org.uk> writes:
> On Thu, Nov 26, 2015 at 12:50:08AM +0000, Måns Rullgård wrote:
>> If not calling the function saves an I-cache miss, the benefit can be
>> substantial. No, I have no proof of this being a problem, but it's
>> something that could happen.
>
> That's a simplistic view of modern CPUs.
>
> As I've already said, modern CPUs have branch prediction, but
> they also have speculative instruction fetching and speculative data
> prefetching - which the CPUs which have idiv support will have.
>
> With such features, the branch predictor is able to learn that the
> branch will be taken, and because of the speculative instruction
> fetching, it can bring the cache line in so that it has the
> instructions it needs with minimal stalling or, if working correctly,
> without stalling the CPU pipeline at all.
It doesn't matter how many fancy features the CPU has. Executing more
branches and using more cache lines puts additional pressure on those
resources, reducing overall performance. Besides, the performance
counters readily show that the prediction is nowhere near as perfect as
you seem to believe.
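
To make the cost under discussion concrete, here is a minimal sketch in C.
The function names quot and uquot are hypothetical, and the instruction
sequences described in the comment are what GCC typically emits; the exact
output depends on compiler version and flags.

```c
/*
 * Hypothetical helpers, for illustration only.
 *
 * Built for an ARM target without hardware divide (e.g. -march=armv7-a),
 * GCC typically lowers each division to a call into the EABI runtime:
 *     bl __aeabi_idiv      (signed)
 *     bl __aeabi_uidiv     (unsigned)
 * Built for a target with the divide extension (e.g. -mcpu=cortex-a15),
 * the same source typically becomes a single sdiv or udiv instruction,
 * with no branch to a helper and no extra cache line for its code.
 */
int quot(int a, int b)
{
	return a / b;	/* signed: __aeabi_idiv call or sdiv */
}

unsigned int uquot(unsigned int a, unsigned int b)
{
	return a / b;	/* unsigned: __aeabi_uidiv call or udiv */
}
```

Every call site of the first form costs a branch, a return, and a possible
I-cache line for the helper; the second form costs none of these.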
--
Måns Rullgård
mans at mansr.com