Call for testing/opinions: Optimized memset/memcpy

Harm Hanemaaijer fgenfb at yahoo.com
Sun Jul 14 07:19:27 EDT 2013


Dr. David Alan Gilbert <gilbertd <at> treblig.org> writes:
> 
> Maybe neon is worth a try these days (although be careful of platforms
> like Tegra 2 that doesn't have it); there was a recent patch that enabled
> use in the kernel (I think for some RAID use). The downside is it's
> supposed to be quite power hungry.
> 

As it turns out, NEON isn't too hard to implement. I have added NEON support
to copy_page, memset, memzero, and memcpy (both the aligned and unaligned
cases) in my userspace testing environment. It gives a nice boost, ranging
from 10% for copy_page to more than 30% for unaligned memcpy on a Cortex-A8,
and the gain could be larger on other cores. I have not tested a live kernel
yet, but it looks like NEON can be used fairly transparently, guarded by
#ifdef CONFIG_NEON, as long as only the lower end of the NEON/VFP register
file is clobbered (although this needs verification).
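
To illustrate the general idea of a compile-time-guarded NEON path, here is a
minimal userspace sketch in C. It is not the code from my testing environment:
the function name copy_bytes_sketch and the SKETCH_USE_NEON macro are
hypothetical, the compiler's __ARM_NEON macro stands in for CONFIG_NEON, and
intrinsics are used instead of hand-written assembly. With intrinsics the
compiler picks the registers, so the sketch says nothing about restricting
clobbers to the lower end of the register file.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: __ARM_NEON (or the older __ARM_NEON__) plays the role that
 * CONFIG_NEON would play in a kernel build. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_USE_NEON 1
#else
#define SKETCH_USE_NEON 0
#endif

/* Hypothetical helper, not a kernel function.
 * Copies n bytes from src to dst; the regions must not overlap. */
void *copy_bytes_sketch(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

#if SKETCH_USE_NEON
    /* Move 32 bytes per iteration through two 128-bit registers.
     * Byte-element loads and stores do not require 16-byte alignment. */
    while (n >= 32) {
        uint8x16_t lo = vld1q_u8(s);
        uint8x16_t hi = vld1q_u8(s + 16);
        vst1q_u8(d, lo);
        vst1q_u8(d + 16, hi);
        s += 32;
        d += 32;
        n -= 32;
    }
#endif

    /* Scalar tail; this is also the whole copy when NEON is unavailable. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

Built without NEON, the function falls back to the plain byte loop, so the
same source compiles on platforms such as Tegra 2. A real implementation would
also align the destination and prefetch ahead, which this sketch leaves out.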

More information about the linux-arm-kernel mailing list