[RFC PATCH 1/2] arm64: jump_label: use more precise asm constraints

Ard Biesheuvel ardb at kernel.org
Thu Apr 28 02:35:51 PDT 2022


On Wed, 27 Apr 2022 at 23:50, Ard Biesheuvel <ardb at kernel.org> wrote:
>
> Hi Nick,
>
> On Wed, 27 Apr 2022 at 20:59, Nick Desaulniers <ndesaulniers at google.com> wrote:
> >
> > On Wed, Apr 27, 2022 at 10:13 AM Ard Biesheuvel <ardb at kernel.org> wrote:
> > >
> > > In order to set bit #0 of the struct static_key pointer in the the jump
> > > label struct, we currently cast the pointer to char[], and take the
> > > address of either the 0th or 1st array member, depending on the value of
> > > 'branch'. This works but creates problems with -fpie code generation:
> > > GCC complains about the constraint being unsatisfiable, and Clang
> > > miscompiles the code in a way that causes stability issues (immediate
> > > panic on 'attempt to kill init')
> >
> > Any more info on the "miscompile?" Perhaps worth an upstream bug report?
> >
>
> I made very little progress trying to narrow it down: a segfault in
> user space caused by who knows what. I'd file a bug if I knew how to
> reproduce it.
>

... and now, I cannot even reproduce it anymore, so I'll drop this
mention altogether.

> > >
> > > So instead, pass the struct static_key reference and the 'branch'
> > > immediate individually, in a way that satisfies both GCC and Clang (GCC
> > > wants the 'S' constraint, whereas Clang wants the 'i' constraint for
> > > argument %0)
> >
> > Anything we can do to improve Clang's handling of S constraints?
> > https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints
> > >> An absolute symbolic address or a label reference
> >
>
> "S" seems more appropriate than "i" for a symbol reference, and GCC
> rejects "i" when using -fpie. But the actual literal being emitted is
> a relative reference, not an absolute one. This likely needs more
> discussion, but using "i" in the way we do now is definitely dodgy.
>

I have tried to create a reproducer for this issue, but the code
below, which does essentially the same thing as jump_label.h, works
fine with Clang and GCC with or without -fpie.

extern struct s {
  struct { } m;
} ss;

static inline __attribute__((always_inline)) int inner(struct s *s) {
  asm goto("adrp xzr, %c0" : : "S"(&s->m) : : l);

  return 0;
l:return 1;
}

int outer(void) {
  return inner(&ss);
}

So what's tricky here is that arch_static_branch() [like inner()
above] does not refer to the global directly, which is either struct
static_key_true or struct static_key_false, but to its 'key' member,
which is the first member of either former type. However, Clang does
not seem to care in case of this example, but when building the
kernel, it errors out with

/home/ard/linux/arch/arm64/include/asm/jump_label.h:22:3: error:
invalid operand for inline asm constraint 'S'

Another thing I hate about this code is that it needs optimizations
enabled, or it won't compile.

> > >
> > > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > > ---
> > >  arch/arm64/include/asm/jump_label.h | 8 ++++----
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h
> > > index cea441b6aa5d..f741bbacf175 100644
> > > --- a/arch/arm64/include/asm/jump_label.h
> > > +++ b/arch/arm64/include/asm/jump_label.h
> > > @@ -23,9 +23,9 @@ static __always_inline bool arch_static_branch(struct static_key *key,
> > >                  "      .pushsection    __jump_table, \"aw\"    \n\t"
> > >                  "      .align          3                       \n\t"
> > >                  "      .long           1b - ., %l[l_yes] - .   \n\t"
> > > -                "      .quad           %c0 - .                 \n\t"
> > > +                "      .quad           %c0 - . + %1            \n\t"
> >
> > If %1 is "i" constrained, then I think we can use the %c output
> > template for it as well?
> > https://gcc.gnu.org/onlinedocs/gccint/Output-Template.html
> >
>
> Is %c what prevents the leading # from being emitted? I'm not sure
> whether that matters here, as AArch64 asm does not require it (and the
> code builds fine with either compiler). But if it reflects the use
> more precisely, I agree we should add it.
>
> > Is the expression clearer if we keep the `- .` at the end of each expression?
> >
>
> I don't think so. The add sets the enabled bit, so I'd even be
> inclined to write this as (%c0 - .) + %c1 to emphasize that this is a
> relative reference with bit #0 set separately.
>
> > >                  "      .popsection                             \n\t"
> > > -                :  :  "i"(&((char *)key)[branch]) :  : l_yes);
> > > +                :  :  "Si"(key), "i"(branch) :  : l_yes);
> > >
> > >         return false;
> > >  l_yes:
> > > @@ -40,9 +40,9 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key,
> > >                  "      .pushsection    __jump_table, \"aw\"    \n\t"
> > >                  "      .align          3                       \n\t"
> > >                  "      .long           1b - ., %l[l_yes] - .   \n\t"
> > > -                "      .quad           %c0 - .                 \n\t"
> > > +                "      .quad           %c0 - . + %1            \n\t"
> > >                  "      .popsection                             \n\t"
> > > -                :  :  "i"(&((char *)key)[branch]) :  : l_yes);
> > > +                :  :  "Si"(key), "i"(branch) :  : l_yes);
> > >
> > >         return false;
> > >  l_yes:
> > > --
> > > 2.30.2
> > >
> >
> >
> > --
> > Thanks,
> > ~Nick Desaulniers



More information about the linux-arm-kernel mailing list