[PATCH v3 10/20] arm64: assembler: add utility macros to push/pop stack frames
Ard Biesheuvel
ard.biesheuvel at linaro.org
Thu Dec 7 06:58:43 PST 2017
On 7 December 2017 at 14:53, Dave Martin <Dave.Martin at arm.com> wrote:
> On Thu, Dec 07, 2017 at 02:21:17PM +0000, Ard Biesheuvel wrote:
>> On 7 December 2017 at 14:11, Dave Martin <Dave.Martin at arm.com> wrote:
>> > On Wed, Dec 06, 2017 at 07:43:36PM +0000, Ard Biesheuvel wrote:
>> >> We are going to add code to all the NEON crypto routines that will
>> >> turn them into non-leaf functions, so we need to manage the stack
>> >> frames. To make this less tedious and error prone, add some macros
>> >> that take the number of callee saved registers to preserve and the
>> >> extra size to allocate in the stack frame (for locals) and emit
>> >> the ldp/stp sequences.
>> >>
>> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel at linaro.org>
>> >> ---
>> >> arch/arm64/include/asm/assembler.h | 60 ++++++++++++++++++++
>> >> 1 file changed, 60 insertions(+)
>> >>
>> >> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>> >> index aef72d886677..5f61487e9f93 100644
>> >> --- a/arch/arm64/include/asm/assembler.h
>> >> +++ b/arch/arm64/include/asm/assembler.h
>> >> @@ -499,6 +499,66 @@ alternative_else_nop_endif
>> >> #endif
>> >> .endm
>> >>
>> >> + /*
>> >> + * frame_push - Push @regcount callee saved registers to the stack,
>> >> + * starting at x19, as well as x29/x30, and set x29 to
>> >> + * the new value of sp. Add @extra bytes of stack space
>> >> + * for locals.
>> >> + */
>> >> + .macro frame_push, regcount:req, extra
>> >> + __frame st, \regcount, \extra
>> >> + .endm
>> >> +
>> >> + /*
>> >> + * frame_pop - Pop @regcount callee saved registers from the stack,
>> >> + * starting at x19, as well as x29/x30. Also pop @extra
>> >> + * bytes of stack space for locals.
>> >> + */
>> >> + .macro frame_pop, regcount:req, extra
>> >> + __frame ld, \regcount, \extra
>> >> + .endm
>> >> +
>> >> + .macro __frame, op, regcount:req, extra=0
>> >> + .ifc \op, st
>> >> + stp x29, x30, [sp, #-((\regcount + 3) / 2) * 16 - \extra]!
>> >> + mov x29, sp
>> >> + .endif
>> >> + .if \regcount < 0 || \regcount > 10
>> >> + .error "regcount should be in the range [0 ... 10]"
>> >> + .endif
>> >> + .if (\extra % 16) != 0
>> >> + .error "extra should be a multiple of 16 bytes"
>> >> + .endif
>> >> + .if \regcount > 1
>> >> + \op\()p x19, x20, [sp, #16]
>> >> + .if \regcount > 3
>> >> + \op\()p x21, x22, [sp, #32]
>> >> + .if \regcount > 5
>> >> + \op\()p x23, x24, [sp, #48]
>> >> + .if \regcount > 7
>> >> + \op\()p x25, x26, [sp, #64]
>> >> + .if \regcount > 9
>> >> + \op\()p x27, x28, [sp, #80]
>> >
>> > Can the _for thing I introduced in fpsimdmacros.h be any use here?
>> > Alternatively, the following could replace that .if-slide,
>> > providing the calling macro does .altmacro .. .noaltmacro somewhere.
>> >
>> > .macro _pushpop2 op, n1, n2, offset
>> > \op x\n1, x\n2, [sp, #\offset]
>> > .endm
>> >
>> > .macro _pushpop op, first, last, offset
>> > .if \first < \last
>> > _pushpop2 \op\()p, \first, %\first + 1, \offset
>> > _pushpop \op, %\first + 2, \last, %\offset + 16
>> > .elseif \first == \last
>> > \op\()r x\first, [sp, #\offset]
>> > .endif
>> > .endm
>> >
>>
>> I'd prefer not to rely on altmacro, for reasons you pointed out
>> yourself a while ago IIRC.
>>
>> I agree your version is more compact, but for a write once thing, I'm
>> not sure if it matters.
>>
>> > Also, I wonder whether it would be more readable at the call site
>> > to specify the first and last reg numbers instead of the reg count,
>> > e.g.:
>> >
>> > frame_push first_reg=19, last_reg=23
>> >
>> > (or whatever). Just syntactic sugar though.
>> >
>>
>> Again, this will involve arithmetic on macro arguments, which implies
>> altmacro. Relying on altmacro being set is dodgy, and unfortunately,
>> we can't enable it in the macro without keeping it enabled (or we may
>> disable it on behalf of the caller. I guess we could try to come up
>> with a smart way to infer whether altmacro was enabled, and only
>> disable it afterwards if it wasn't, using some directives that get
>> interpreted differently, but to be honest, I factored out this
>> sequence so I could think about more important things :-)
>
> Sure, no worries.
>
> I've changed my mind a bit about .altmacro, in that it is not really
> usable at all unless turned on explicitly, and then off again, only
> where it's needed. So if you just assume it's always off, things are
> sane (and that's what happens in practice).
>
> But it's not really needed here -- my main confusion was with the
> deeply nested .ifs, but perhaps that could be avoided more
> straightforwardly:
>
> .if \regcount > 1
> \op\()p x19, x20, [sp, #16]
> .endif
> .if \regcount > 3
> \op\()p x21, x22, [sp, #32]
> .endif
> // ...
> .if \regcount > 9
> \op\()p x27, x28, [sp, #80]
> .endif
>
> .if \regcount == 1
> \op\()r x19, [sp, #20]
> .endif
> .if \regcount == 3
> \op\()r x21, [sp, #22]
> .endif
> // ...
> .if \regcount == 9
> \op\()r x27, [sp, #28]
> .endif
>
Yes, that does look better.
>
> One other thing, should you be protecting the macro args with ()?
>
> It seems unlikely that an expression would be passed for regcount,
> but for extra it's a bit more plausible.
>
Good point, given that I subtract \extra from the frame size in the ldp case.
> Cheers
> ---Dave
More information about the linux-arm-kernel
mailing list