CFT: move outer_cache_sync() out of line

Arnd Bergmann arnd at arndb.de
Tue Jan 13 08:34:05 PST 2015


On Monday 12 January 2015 16:36:48 Russell King - ARM Linux wrote:

> Theoretically, this should help overall system performance, since the
> branch predictor should be able to predict this better, but it's entirely
> possible that trying to benchmark a single workload won't be measurably
> different.
> 
> In terms of kernel size figures, this change alone saves almost 17K of
> 10MB of kernel text on my iMX6 kernels - which is bordering on
> insignificant since that's not quite a 0.2% saving.
> 
> So... right now I can't justify this change, but I'm hoping someone can
> come up with some figures which show that it benefits their workload
> without causing a performance regression for others.

From the theory, I think it can only help to do this. I would guess that
the time spent inside the cache_sync function dwarfs both the extra
unconditional branch you introduce and any possible misprediction, so
17K in space savings sounds like more than enough justification to just
do it.
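
To make the trade-off concrete, here is a minimal userspace sketch of the
two shapes being compared. It is not the kernel code: the names cache_ops,
fake_sync, cache_sync_inline and cache_sync_outofline are made up for
illustration, and the noinline attribute assumes GCC or Clang.

```c
#include <stddef.h>

/* Hypothetical stand-in for a table of optional cache callbacks. */
struct cache_ops {
	void (*sync)(void);
};

static struct cache_ops cache_ops;	/* .sync starts out NULL */
static int sync_calls;			/* counts callback invocations */

/* Hypothetical callback, standing in for the expensive sync operation. */
static void fake_sync(void)
{
	sync_calls++;
}

/*
 * Inline variant: the load, the NULL check and the indirect call are
 * duplicated at every call site, which is where the text size goes.
 */
static inline void cache_sync_inline(void)
{
	if (cache_ops.sync)
		cache_ops.sync();
}

/*
 * Out-of-line variant: a single shared copy of the check. Each call
 * site shrinks to one direct branch, at the cost of that extra branch.
 */
__attribute__((noinline)) void cache_sync_outofline(void)
{
	if (cache_ops.sync)
		cache_ops.sync();
}
```

Both variants behave identically; only the code placement differs. If the
callback itself is expensive, the one extra direct branch in the
out-of-line version should be lost in the noise.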

Acked-by: Arnd Bergmann <arnd at arndb.de>


