Call for testing/opinions: Optimized memset/memcpy
Harm Hanemaaijer
fgenfb at yahoo.com
Sun Jul 14 09:33:14 EDT 2013
Ard Biesheuvel <ard.biesheuvel <at> linaro.org> writes:
>
> You will clobber the userland NEON contents of the register file if
> you don't preserve them properly. Also, kernel preemption (if enabled)
> may put your task to sleep at any time, and the context switching
> machinery is totally oblivious of NEON being used in the kernel, so
> the kernel side will get corrupted as well in this case.
>
> I have a patch series pending (i.e., accepted but not pulled yet by
> Russell) which addresses these issues.
>
That is what I was afraid of regarding NEON. It must be tricky to solve
without sacrificing performance, since saving and restoring the entire NEON
register file would obviously have a serious impact on context switch
performance. For memcpy-like operations, only four doubleword registers
(d0-d3) are really required, so that case could possibly be optimized for.
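
To put rough numbers on that: the NEON register file holds 32 doubleword
registers (d0-d31, 8 bytes each), so a full save is 256 bytes, against 32
bytes for d0-d3 alone. Below is a minimal portable C sketch of the idea, not
kernel code. The name copy_model and the enum constants are made up for
illustration, and the array r[] merely stands in for d0-d3. A real
implementation would use NEON load/store instructions and would still need
proper save/restore handling in the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Save-area sizes in bytes; NEON has 32 doubleword registers (d0-d31). */
enum {
    NEON_DREG_BYTES       = 8,
    NEON_FULL_SAVE_BYTES  = 32 * NEON_DREG_BYTES, /* whole register file */
    NEON_D0_D3_SAVE_BYTES = 4 * NEON_DREG_BYTES   /* d0-d3 only */
};

/*
 * Hypothetical model of a memcpy-style loop that moves 32 bytes per
 * iteration through four 64-bit temporaries (stand-ins for d0-d3).
 */
static void *copy_model(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uint64_t r[4]; /* models d0-d3 */

    while (n >= sizeof r) {
        memcpy(r, s, sizeof r); /* models: vld1.8 {d0-d3}, [src]! */
        memcpy(d, r, sizeof r); /* models: vst1.8 {d0-d3}, [dst]! */
        s += sizeof r;
        d += sizeof r;
        n -= sizeof r;
    }
    memcpy(d, s, n); /* remaining tail of fewer than 32 bytes */
    return dst;
}
```

The point is only that the inner loop never touches more than four
doubleword registers, so a save area an eighth the size of the full
register file would cover it.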
More information about the linux-arm-kernel mailing list