[PATCH 1/2] ARM: lib: Add optimized memcpy with 64 byte pld size
Nicolas Pitre
nico at fluxnic.net
Wed Mar 28 01:23:56 EDT 2012
On Wed, 28 Mar 2012, Boojin Kim wrote:
> Nicolas wrote:
>
> > This creates quite convoluted code. If this is worth doing, we'll have
> > to find a cleaner way to do this.
> >
> > Could you please provide performance measurement numbers with and
> > without this patch, and similarly for the next patch?
> >
> > Did you try enabling the cache alignment code? What performance
> > difference if any did you see?
> My patch gives about 10% better results on cache boundaries.
> A 64-byte PLD size improves cache efficiency on machines that have a 64-byte cache line.
> Also, which code is convoluted? Can you explain in more detail?
Yes, I will. I have now reworked this code to be extensible and still
as clean as possible. I'm not going to post it right away though, given
that it is late and I prefer to have another look at it after I have had
some sleep.
Nicolas