[PATCH v2 11/19] arm64: assembler: add macro to conditionally yield the NEON under PREEMPT

Dave Martin Dave.Martin at arm.com
Wed Dec 6 03:51:37 PST 2017


On Tue, Dec 05, 2017 at 06:04:34PM +0000, Ard Biesheuvel wrote:
> On 5 December 2017 at 12:45, Ard Biesheuvel <ard.biesheuvel at linaro.org> wrote:
> >
> >
> >> On 5 Dec 2017, at 12:28, Dave Martin <Dave.Martin at arm.com> wrote:
> >>
> >>> On Mon, Dec 04, 2017 at 12:26:37PM +0000, Ard Biesheuvel wrote:
> >>> Add a support macro to conditionally yield the NEON (and thus the CPU)
> >>> that may be called from the assembler code. Given that especially the
> >>> instruction based accelerated crypto code may use very tight loops, add
> >>> some parametrization so that the TIF_NEED_RESCHED flag test is only
> >>> executed every so many loop iterations.
> >>>
> >>> In some cases, yielding the NEON involves saving and restoring a non
> >>> trivial amount of context (especially in the CRC folding algorithms),
> >>> and so the macro is split into two, and the code in between is only
> >>> executed when the yield path is taken, allowing the contex to be preserved.
> >>> The second macro takes a label argument that marks the resume-from-yield
> >>> path, which should restore the preserved context again.
> >>>
> >>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel at linaro.org>
> >>> ---
> >>> arch/arm64/include/asm/assembler.h | 50 ++++++++++++++++++++
> >>> 1 file changed, 50 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> >>> index aef72d886677..917b026d3e00 100644
> >>> --- a/arch/arm64/include/asm/assembler.h
> >>> +++ b/arch/arm64/include/asm/assembler.h
> >>> @@ -512,4 +512,54 @@ alternative_else_nop_endif
> >>> #endif
> >>>    .endm
> >>>
> >>> +/*
> >>> + * yield_neon - check whether to yield to another runnable task from
> >>> + *        kernel mode NEON code (running with preemption disabled)
> >>> + *
> >>> + * - Check whether the preempt count is exactly 1, in which case disabling
> >>> + *   preemption once will make the task preemptible. If this is not the case,
> >>> + *   yielding is pointless.
> >>> + * - Check whether TIF_NEED_RESCHED is set, and if so, disable and re-enable
> >>> + *   kernel mode NEON (which will trigger a reschedule), and branch to the
> >>> + *   yield fixup code at @lbl.
> >>> + */
> >>> +    .macro        yield_neon, lbl:req, ctr, order, stride, loop
> >>> +    yield_neon_pre    \ctr, \order, \stride, \loop
> >>> +    yield_neon_post    \lbl
> >>> +    .endm
> >>> +
> >>> +    .macro        yield_neon_pre, ctr, order=0, stride, loop=4444f
> >>> +#ifdef CONFIG_PREEMPT
> >>> +    /*
> >>> +     * With some algorithms, it makes little sense to poll the
> >>> +     * TIF_NEED_RESCHED flag after every iteration, so only perform
> >>> +     * the check every 2^order strides.
> >>> +     */
> >>> +    .if        \order > 1
> >>> +    .if        (\stride & (\stride - 1)) != 0
> >>> +    .error        "stride should be a power of 2"
> >>> +    .endif
> >>> +    tst        \ctr, #((1 << \order) * \stride - 1) & ~(\stride - 1)
> >>> +    b.ne        \loop
> >>> +    .endif
> >>
> >> I'm not sure what baking in this check gives us, and this seems to
> >> conflate two rather different things: yielding and defining a
> >> "heartbeat" frequency for the calling code.
> >>
> >> Can we separate out the crypto-loop-helper semantics from the yield-
> >> neon stuff?
> >>
> >
> > Fair enough. I incorporated the check here so it disappears from the code entirely when !CONFIG_PREEMPT, because otherwise, you end up with a sequence that is mispredicted every # iterations without any benefit.
> > I guess i could macroise that separately though.
> >
> >> If we had
> >>
> >>    if_cond_yield_neon
> >>    // patchup code
> >>    endif_yield_neon
> >>
> 
> I like this, but ...
> 
> >> then the caller is free to conditionally branch over that as appropriate
> >> like
> >>
> >> loop:
> >>    // crypto stuff
> >>    tst x0, #0xf
> >>    b.ne    loop
> >>
> >>    if_cond_yield_neon
> >>    // patchup code
> >>    endif_cond_yield_neon
> >>
> 
> I need to put the post patchup code somewhere too. Please refer to the
> CRC patches for the best examples of this.

I'm not sure I follow.  If you need to do something different after
patchup, can't you either pull that code into the if...endif too, or
otherwise put a branch before the endif?

[...]

Cheers
---Dave



More information about the linux-arm-kernel mailing list