[PATCHv2 1/6] arm64: lib: Implement optimized memcpy routine
Catalin Marinas
catalin.marinas at arm.com
Fri May 9 07:13:09 PDT 2014
On Mon, Apr 28, 2014 at 06:11:29AM +0100, zhichang.yuan at linaro.org wrote:
> This patch, based on Linaro's Cortex Strings library, improves
> the performance of the assembly optimized memcpy() function.
[...]
> --- a/arch/arm64/lib/memcpy.S
> +++ b/arch/arm64/lib/memcpy.S
[...]
> ENTRY(memcpy)
[...]
> + mov dst, dstin
> + cmp count, #16
> + /* When count is less than 16, the accesses are not aligned. */
> + b.lo .Ltiny15
> +
> + neg tmp2, src
> + ands tmp2, tmp2, #15 /* Bytes to reach alignment. */
> + b.eq .LSrcAligned
> + sub count, count, tmp2
I started looking at this and comparing it to the original Cortex
Strings library. Is there any reason why at least the first part has
been rewritten? For example, the Cortex Strings version starts with what
is probably the most likely case, comparing the count with 64.
--
Catalin