Call for testing/opinions: Optimized memset/memcpy
catalin.marinas at arm.com
Mon Jul 15 09:15:20 EDT 2013
On Sat, Jul 13, 2013 at 10:13:12PM +0100, Harm Hanemaaijer wrote:
> Dr. David Alan Gilbert <gilbertd <at> treblig.org> writes:
> > You might like to compare with some of the routines at:
> > https://launchpad.net/cortex-strings
> > and some of the numbers at:
> > https://wiki.linaro.org/WorkingGroups/ToolChain/Benchmarks/
> That's interesting. I had looked at cortex-strings before but didn't
> dig into it, also because its benchmark program seemed to be limited in
> scope. From the Linaro numbers it seems NEON isn't always a win
> especially on newer Cortex platforms, with large variability across
> different platforms/cores.
As has been stated in this thread, we shouldn't use Neon for memcpy:
there is a significant overhead with saving/restoring Neon registers.
But Cortex Strings is a good starting point and Linaro is going to port
some of these functions to the Linux kernel for ARMv8 (AArch64).