[RFC v2 PATCH 2/4] ARM64: add support for kernel mode NEON in atomic context
Catalin Marinas
catalin.marinas at arm.com
Fri Oct 11 13:14:46 EDT 2013
On Wed, Oct 09, 2013 at 07:50:32PM +0100, Ard Biesheuvel wrote:
> --- a/arch/arm64/include/asm/neon.h
> +++ b/arch/arm64/include/asm/neon.h
> @@ -8,7 +8,38 @@
> * published by the Free Software Foundation.
> */
>
> +#include <linux/hardirq.h>
> +#include <linux/types.h>
> +#include <asm/fpsimd.h>
> +
> #define cpu_has_neon() (1)
>
> +#define DEFINE_NEON_STACK_REGS(a, num) \
> + struct { \
> + struct fpsimd_partial_state regs; \
> + __uint128_t vregs[(num) > 32 ? 32 : ((num) + 1) & ~1U]; \
> + } a = { .regs.num_regs = sizeof(a.vregs) / sizeof(__uint128_t) }
> +
> +#define DEFINE_NEON_STACK_REGS_ALL(name) DEFINE_NEON_STACK_REGS(name, 32)
> +
> void kernel_neon_begin(void);
> void kernel_neon_end(void);
> +
> +static inline void __kernel_neon_begin_atomic(struct fpsimd_partial_state *regs)
> +{
> + if (!in_interrupt())
> + kernel_neon_begin();
> + else
> + fpsimd_save_partial_state(regs);
> +}
> +
> +static inline void __kernel_neon_end_atomic(struct fpsimd_partial_state *regs)
> +{
> + if (!in_interrupt())
> + kernel_neon_end();
> + else
> + fpsimd_load_partial_state(regs);
> +}
The _atomic suffix is a bit misleading (you basically mean no user
context). I wonder whether it would be better to have _fast/_slow
variants instead. Looking at the other two patches, you only need 2 or 4
registers for the crypto code, but if you are not in_interrupt() you
save and restore the full NEON bank. For such cases I would just call a
kernel_neon_begin_fast() variant, which is safe in all contexts and much
faster.
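
To put rough numbers on the difference, here is a minimal userspace sketch
(not kernel code). It reuses the rounding from the vregs[] array size in the
quoted DEFINE_NEON_STACK_REGS macro and assumes 16 bytes per 128-bit NEON
register; the helper names neon_stack_regs() and neon_save_bytes() are
hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical userspace model for illustration; not part of the patch. */
#define NEON_NUM_REGS  32U  /* v0..v31 on ARM64 */
#define NEON_REG_BYTES 16U  /* each register is 128 bits wide */

/*
 * Same rounding as the vregs[] array size in DEFINE_NEON_STACK_REGS:
 * round the requested count up to an even number, capped at 32.
 */
static unsigned int neon_stack_regs(unsigned int num)
{
	return num > NEON_NUM_REGS ? NEON_NUM_REGS : (num + 1) & ~1U;
}

/* Bytes of register data a partial save of 'num' registers would copy. */
static size_t neon_save_bytes(unsigned int num)
{
	return (size_t)neon_stack_regs(num) * NEON_REG_BYTES;
}
```

Under these assumptions, neon_save_bytes(2) and neon_save_bytes(4) give 32
and 64 bytes, against 512 bytes of register data for the full bank
(neon_save_bytes(32)), ignoring any control/status registers saved alongside.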
--
Catalin