[PATCH resend v2 0/2] preparatory arm64 asm patches for yielding the NEON

Ard Biesheuvel ard.biesheuvel at linaro.org
Thu Mar 29 06:13:21 PDT 2018


The RT people reported that the arm64 crypto NEON code behaves poorly in RT
context, because it disables preemption (to avoid having to context switch
the NEON registers) and usually processes the entire input in one go. When we
introduced this code, this was not unreasonable given the overhead of eager
preserve/restore, but today, there isn't that much overhead anymore, and so
we can consider approaches that have much better worst case scheduling latency.

Simply refactoring the code to only call into the core NEON transform one
block at a time results in a non-negligible performance impact, especially
on low end cores such as Cortex-A53 where memory accesses are relatively
costly. So instead, let's introduce some infrastructure to allow assembler
routines to do a conditional yield, i.e., check the TIF_NEED_RESCHED flag
after processing each block of input, and yield if it is set, in which case
some context may need to be preserved and restored, and/or constant tables
reloaded.
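
To illustrate the idea, here is a minimal user-space sketch in C of the
conditional yield pattern described above. It is not the code from these
patches, which implement the pattern as assembler macros. All names in it
(need_resched, do_yield, struct ctx, transform, request_resched_on_even)
are hypothetical stand-ins: a plain flag models TIF_NEED_RESCHED, and a
local variable models state that lives in NEON registers and is lost
across a yield.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/* Hypothetical stand-in for the TIF_NEED_RESCHED thread flag. */
static bool need_resched;
/* Counts how often the transform gave up the CPU (for illustration only). */
static unsigned int yields;

/* Hypothetical memory area where live state is spilled before a yield. */
struct ctx {
	uint32_t saved_acc;
};

/* Stand-in for yielding the CPU: the request is consumed and counted. */
static void do_yield(void)
{
	need_resched = false;
	yields++;
}

/* Toy per-block transform; a real routine would be the NEON core loop. */
static uint32_t process_block(uint32_t acc, const uint8_t *block)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		acc = acc * 31u + block[i];
	return acc;
}

/* Example hook that simulates a reschedule request after every 2nd block. */
static void request_resched_on_even(size_t done)
{
	if (done % 2 == 0)
		need_resched = true;
}

/*
 * Process nblocks blocks, checking the flag after each block instead of
 * running the whole input with no opportunity to yield.
 */
static uint32_t transform(struct ctx *ctx, const uint8_t *in, size_t nblocks,
			  void (*after_block)(size_t done))
{
	uint32_t acc = 0;	/* models state held in NEON registers */

	assert(ctx != NULL && (in != NULL || nblocks == 0));

	for (size_t n = 0; n < nblocks; n++) {
		acc = process_block(acc, in + n * BLOCK_SIZE);
		if (after_block)
			after_block(n + 1);

		if (need_resched) {
			ctx->saved_acc = acc;	/* preserve live state */
			acc = 0xdeadbeefu;	/* registers are clobbered */
			do_yield();
			acc = ctx->saved_acc;	/* restore after resuming */
		}
	}
	return acc;
}
```

In this sketch the result is the same whether or not a yield occurs,
because everything that is live across the yield point is spilled to
memory first. The cost in the common case is one flag check per block.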

Changes since v1:
- incorporate Dave's review feedback and add his Reviewed-bys
  . enhance non-nesting check in frame_push/_pop (#1)
  . describe cond_yield_neon convenience macro (#2)
  . discard yield sequence if CONFIG_PREEMPT=n (#2)
  . add missing include of linux/preempt.h (#2)

Patch #1 adds helper macros to create standard AAPCS stack frames. This is
needed because the assembler code will be modified to (essentially) call
into schedule(), and so a stack frame is needed to preserve state.

Patch #2 adds helper macros to create the yielding code: check whether a
yield should be done, and preserve/restore the algorithm-specific state
held in the NEON registers, which is not preserved across the yield.

These patches have been broken out from the arm64/crypto series and resent
since they require careful review from the arm64 maintainers, rather than
being pulled in silently via the crypto tree (which already happened by
accident and got reverted).

Ard Biesheuvel (2):
  arm64: assembler: add utility macros to push/pop stack frames
  arm64: assembler: add macros to conditionally yield the NEON under
    PREEMPT

 arch/arm64/include/asm/assembler.h | 136 ++++++++++++++++++++
 arch/arm64/kernel/asm-offsets.c    |   3 +
 2 files changed, 139 insertions(+)

-- 
2.11.0



