[PATCH v2 0/9] arm64: rework NEON yielding to avoid scheduling from asm code

Herbert Xu herbert at gondor.apana.org.au
Wed Feb 10 02:23:07 EST 2021


On Wed, Feb 03, 2021 at 12:36:17PM +0100, Ard Biesheuvel wrote:
> Given how kernel mode NEON code disables preemption (to ensure that the
> FP/SIMD register state is protected without having to context switch it),
> we need to take care not to let those algorithms operate on unbounded
> input data, or we may end up with excessive scheduling blackouts on
> CONFIG_PREEMPT kernels.
> 
> This is currently handled by the cond_yield_neon macros, which check the
> preempt count and the TIF_NEED_RESCHED flag from assembler code, and call
> into kernel_neon_end()+kernel_neon_begin(), triggering a reschedule.
> This works as expected, but is a bit messy, given how much of the state
> preserve/restore code in the algorithm needs to be duplicated, as well as
> causing the need to manage the stack frame explicitly. All of this is better
> handled by the compiler, especially now that we have enabled features such
> as the shadow call stack and BTI, and are working to improve call stack
> validation.
> 
> In some cases, yielding is not necessary at all: algoritms that implement
> skciphers and use the skcipher walk API will be invoked at page granularity,
> which is granular enough for our purpose.
> 
> In other cases, it is better to simply return early from the assembler
> routine if a reschedule is pending, and let the C code handle with this, by
> retrying the call until it completes. This removes any voluntary schedule()
> calls from the call stack, making the code much easier to reason about in
> the context of stack validation, rcu_tasks synchronization, etc.
> 
> Practical note: assuming there are no objections to these changes, it may
> be the most convenient to take patch #1 into the arm64 tree for v5.12,
> and postpone the rest for merging via the crypto tree. (Note that this
> series was created against the cryptodev tree, and so the arm64 maintainers
> are also welcome to take the whole set if it applies cleanly to the arm64
> tree)
> 
> Will: if you stick #1 on a separate branch, please base it on v5.11-rc1
> 
> Changes since v1:
> - use sub+cbz instead of cmp+b.eq to avoid clobbering the flags in cond_yield
>   (patch #1)
> 
> Cc: Dave Martin <dave.martin at arm.com>
> Cc: Eric Biggers <ebiggers at google.com>
> 
> Ard Biesheuvel (9):
>   arm64: assembler: add cond_yield macro
>   crypto: arm64/sha1-ce - simplify NEON yield
>   crypto: arm64/sha2-ce - simplify NEON yield
>   crypto: arm64/sha3-ce - simplify NEON yield
>   crypto: arm64/sha512-ce - simplify NEON yield
>   crypto: arm64/aes-neonbs - remove NEON yield calls
>   crypto: arm64/aes-ce-mac - simplify NEON yield
>   crypto: arm64/crc-t10dif - move NEON yield to C code
>   arm64: assembler: remove conditional NEON yield macros
> 
>  arch/arm64/crypto/aes-glue.c          | 21 +++--
>  arch/arm64/crypto/aes-modes.S         | 52 +++++--------
>  arch/arm64/crypto/aes-neonbs-core.S   |  8 +-
>  arch/arm64/crypto/crct10dif-ce-core.S | 43 +++--------
>  arch/arm64/crypto/crct10dif-ce-glue.c | 30 ++++++--
>  arch/arm64/crypto/sha1-ce-core.S      | 47 ++++--------
>  arch/arm64/crypto/sha1-ce-glue.c      | 22 +++---
>  arch/arm64/crypto/sha2-ce-core.S      | 38 ++++-----
>  arch/arm64/crypto/sha2-ce-glue.c      | 22 +++---
>  arch/arm64/crypto/sha3-ce-core.S      | 81 ++++++++------------
>  arch/arm64/crypto/sha3-ce-glue.c      | 14 ++--
>  arch/arm64/crypto/sha512-ce-core.S    | 29 ++-----
>  arch/arm64/crypto/sha512-ce-glue.c    | 53 +++++++------
>  arch/arm64/include/asm/assembler.h    | 78 +++----------------
>  14 files changed, 209 insertions(+), 329 deletions(-)

Patches 2-8 applied.  Thanks.
-- 
Email: Herbert Xu <herbert at gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



More information about the linux-arm-kernel mailing list