[PATCH] arm64: add prctl(PR_PAC_SET_ENABLED_KEYS)

Wed Aug 26 22:55:51 EDT 2020

On Wed, Aug 26, 2020 at 9:37 AM Dave Martin <Dave.Martin at arm.com> wrote:
>
> On Mon, Aug 24, 2020 at 05:23:27PM -0700, Peter Collingbourne wrote:
> > On Mon, Aug 24, 2020 at 7:49 AM Dave Martin <Dave.Martin at arm.com> wrote:
> > >
> > > On Wed, Aug 19, 2020 at 02:25:45PM -0700, Peter Collingbourne wrote:
> > > > On Wed, Aug 19, 2020 at 3:18 AM Dave Martin <Dave.Martin at arm.com> wrote:
> > > > >
> > > > > On Fri, Jul 31, 2020 at 06:11:52PM -0700, Peter Collingbourne wrote:
> > > > > > This prctl allows the user program to control which PAC keys are enabled
> > > > > > in a particular task. The main reason why this is useful is to enable a
> > > > > > userspace ABI that uses PAC to sign and authenticate function pointers
> > > > > > and other pointers exposed outside of the function, while still allowing
> > > > > > binaries conforming to the ABI to interoperate with legacy binaries that
> > > > > > do not sign or authenticate pointers.
> > > > > >
> > > > > > The idea is that a dynamic loader or early startup code would issue
> > > > > > this prctl very early after establishing that a process may load legacy
> > > > > > binaries, but before executing any PAC instructions.
> > > > >
> > > > > Apologies for the slow response on this, I'd had it on my list for a
> > > > > while...
> > > > >
> > > > > > ---
> > > > > >  .../arm64/pointer-authentication.rst          | 27 +++++++++++++++
> > > > > >  arch/arm64/include/asm/asm_pointer_auth.h     | 19 +++++++++++
> > > > > >  arch/arm64/include/asm/pointer_auth.h         | 10 ++++--
> > > > > >  arch/arm64/include/asm/processor.h            |  5 +++
> > > > > >  arch/arm64/kernel/asm-offsets.c               |  1 +
> > > > > >  arch/arm64/kernel/pointer_auth.c              | 34 +++++++++++++++++++
> > > > > >  include/uapi/linux/prctl.h                    |  3 ++
> > > > > >  kernel/sys.c                                  |  8 +++++
> > > > > >  8 files changed, 105 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/Documentation/arm64/pointer-authentication.rst b/Documentation/arm64/pointer-authentication.rst
> > > > > > index 30b2ab06526b..1f7e064deeb3 100644
> > > > > > --- a/Documentation/arm64/pointer-authentication.rst
> > > > > > +++ b/Documentation/arm64/pointer-authentication.rst
> > > > > > @@ -107,3 +107,30 @@ filter out the Pointer Authentication system key registers from
> > > > > >  KVM_GET/SET_REG_* ioctls and mask those features from cpufeature ID
> > > > > >  register. Any attempt to use the Pointer Authentication instructions will
> > > > > >  result in an UNDEFINED exception being injected into the guest.
> > > > > > +
> > > > > > +
> > > > > > +Enabling and disabling keys
> > > > > > +---------------------------
> > > > > > +
> > > > > > +The prctl PR_PAC_SET_ENABLED_KEYS allows the user program to control which
> > > > > > +PAC keys are enabled in a particular task. It takes two arguments, the
> > > > > > +first being a bitmask of PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY
> > > > > > +and PR_PAC_APDBKEY specifying which keys shall be affected by this prctl,
> > > > > > +and the second being a bitmask of the same bits specifying whether the key
> > > > > > +should be enabled or disabled. For example::
> > > > > > +
> > > > > > +  prctl(PR_PAC_SET_ENABLED_KEYS,
> > > > > > +        PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY | PR_PAC_APDBKEY,
> > > > > > +        PR_PAC_APIBKEY, 0, 0);
> > > > > > +
> > > > > > +disables all keys except the IB key.
> > > > > > +
> > > > > > +The main reason why this is useful is to enable a userspace ABI that uses PAC
> > > > > > +instructions to sign and authenticate function pointers and other pointers
> > > > > > +exposed outside of the function, while still allowing binaries conforming to
> > > > > > +the ABI to interoperate with legacy binaries that do not sign or authenticate
> > > > > > +pointers.
> > > > >
> > > > > What actually breaks without this?
> > > > >
> > > > > Since the keys are all enabled by default, the only purpose of this
> > > > > prctl seems to be to disable keys.  I'm not sure what this is giving us.
> > > >
> > > > Yes, the purpose is to disable keys. Let's consider the function
> > > > pointer signing userspace ABI use case. An example is Apple's arm64e
> > > > ABI, and I have a prototype branch of LLVM [0] that implements a
> > > > similar ABI in Linux userspace, based on Apple's implementation of
> > > > their ABI.
> > > >
> > > > Here's an example of a function that returns a function pointer, and a
> > > > function that calls a function pointer of the same type:
> > > >
> > > > static void f(void) {}
> > > >
> > > > void *return_fp(void) {
> > > >   return f;
> > > > }
> > > >
> > > > void call_fp(void (*p)(void)) {
> > > >   p();
> > > > }
> > > >
> > > > If I compile this with my prototype compiler I get:
> > > >
> > > > $ clang --target=aarch64-android   -fptrauth-calls  fptr.c -S -o - -O3
> > > > -march=armv8.3a
> > > > [...]
> > > > return_fp:                              // @return_fp
> > > > // %bb.0:                               // %entry
> > > >         adrp    x16, f
> > > >         add     x16, x16, :lo12:f
> > > >         mov     x17, #16277
> > > >         pacia   x16, x17
> > > >         mov     x0, x16
> > > >         ret
> > > > [...]
> > > > call_fp:                                // @call_fp
> > > > // %bb.0:                               // %entry
> > > >         mov     w1, #16277
> > > >         braa    x0, x1
> > > > [...]
> > > >
> > > > In this code snippet the function pointer is signed with the IA key
> > > > and discriminator 16277 before being returned. When the function is
> > > > called, the pointer is first authenticated with the same key and
> > > > discriminator.
> > > >
> > > > Now imagine that this code lives in a shared library used both by
> > > > programs that use the function pointer signing ABI and by legacy
> > > > binaries (i.e. programs that use the existing ABI), and we have a
> > > > legacy binary that calls return_fp. If the legacy binary then calls
> > > > the function pointer returned by return_fp, that code will not
> > > > authenticate the pointer before calling it, it will just use a br or
> > > > blr instruction to call it directly, which will lead to a crash if the
> > > > signature bits are set in the function pointer. In order to prevent
> > > > the crash, we need a way to cause the pacia instruction in return_fp
> > > > to become a no-op when running inside the process hosting the legacy
> > > > binary, so that the signature bits will remain clear and the br or blr
> > > > instruction in the legacy binary will successfully call the function
> > > > f. That can be done by disabling the IA key, which is exactly what
> > > > this prctl() lets us do.
> > >
> > > Ack, I think there has been past discussion on this, but it's been
> > > waiting for a usecase since it does impose a bit of extra overhead.
> > >
> > > (I'd somehow assumes that you were actually wanting to get SIGILLs,
> > > rather then compatibly executing some ptrauth insns as NOPs -- the
> > > latter makes much more sense.)
> > >
> > >
> > > Is this going to land for real, or is it just something people have
> > > been experimenting with?
> > >
> > > (I can guess the answer from the fact that you've proposed this patch
> > > in the first place, but I have to ask...)
> >
> > It's something that we're considering for a future revision of the
> > Android ABI, and something that I'm personally interested in having,
> > but no final decision has been made.
> >
> > One problem is that by the time we're ready to make a decision, it'll
> > probably be far too late to make a start on getting the necessary
> > kernel support added, hence the desire to get started on this early.
>
> Fair enough.  It's far from being pure fantasy at least, and the use
> case seems sufficiently real.
>
> >
> > > >
> > > > >
> > > > > > +
> > > > > > +The idea is that a dynamic loader or early startup code would issue this
> > > > > > +prctl very early after establishing that a process may load legacy binaries,
> > > > > > +but before executing any PAC instructions.
> > > > > > diff --git a/arch/arm64/include/asm/asm_pointer_auth.h b/arch/arm64/include/asm/asm_pointer_auth.h
> > > > > > index 52dead2a8640..d121fa5fed5f 100644
> > > > > > --- a/arch/arm64/include/asm/asm_pointer_auth.h
> > > > > > +++ b/arch/arm64/include/asm/asm_pointer_auth.h
> > > > > > @@ -31,6 +31,14 @@ alternative_else_nop_endif
> > > > > >       ldp     \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APDB]
> > > > > >       msr_s   SYS_APDBKEYLO_EL1, \tmp2
> > > > > >       msr_s   SYS_APDBKEYHI_EL1, \tmp3
> > > > > > +
> > > > > > +     ldr     \tmp2, [\tsk, #THREAD_SCTLR_ENXX_MASK]
> > > > > > +     cbz     \tmp2, .Laddr_auth_skip_\@
> > > > >
> > > > > I wonder whether it would make sense to simple store the thread's base
> > > > > SCTLR value (containing the ENxx bits), rather than storing the ENxx
> > > > > bits separately.  There may be reasons outside this snippet why this
> > > > > isn't such a good idea though -- I haven't gone digging so far.
> > > >
> > > > As far as I know (as I learned [1] from the MTE patch series), it can
> > > > be expensive to access SCTLR_EL1, therefore I arrange to only access
> > > > SCTLR_EL1 in the hopefully-uncommon case where a process has disabled
> > > > keys. Detecting the "process has disabled keys" case is quite simple
> > > > if we only store the disabled keys mask here, not so much if we store
> > > > the full value of SCTLR_EL1.
> > >
> > > My thought was that we would still only write SCTLR_EL1 if needed, but
> > > we would do the write-if-needed across the whole register in one go.
> > > This would be easier to extend if we have to twiddle additional
> > > SCTLR_EL1 bits in the future.  If the key enable bits are the only
> > > affected bits for now then we could of course defer this factoring until
> > > later.  (I vaguely remember a similar discussion in the past, but
> > > possibly it was about the pauth keys anyway, rather than something
> > > else.)
> >
> > We will also need a per-task SCTLR_EL1.TCF0 setting for MTE. We could
> > potentially combine the two, so that both
> > prctl(PR_PAC_SET_ENABLED_KEYS) and prctl(PR_SET_TAGGED_ADDR_CTRL) end
> > up affecting a common SCTLR_EL1 field in the task_struct, and we do:
> >
> > On kernel entry:
> > - read SCTLR_EL1 from task_struct (or from the system register if we
> > decide that's cheap enough)
> > - if EnIA clear, set it, and write the value to the system register.
> >
> > On kernel exit:
> > - read system register SCTLR_EL1
> > - read SCTLR_EL1 from task_struct
> > - if they are different, write task's SCTLR_EL1 to the system register.
> >
> > But this would require at least an unconditional read of SCTLR_EL1 per
> > kernel exit. The comment that I referenced says that "accesses" to the
> > register are expensive. I was operating under the assumption that both
> > reads and writes are expensive, but if it's actually only writes that
> > are expensive, we may be able to get away with the above.
>
> The kernel is in full control over what's in SCTLR_EL1, so perhaps we
> could cache what we wrote percpu, if the overhead of reading it is
> problematic.

Maybe, but that sounds like the kind of complexity that I'd like to avoid.

I went ahead and did some performance measurements using the following
program, which issues 10M invalid syscalls and exits. This should give
us something as close as possible to a measurement of a kernel entry
and exit on its own.

.globl _start
_start:
movz x1, :abs_g1:10000000
movk x1, :abs_g0_nc:10000000

.Lloop:
mov x8, #65536 // invalid
svc #0
sub x1, x1, #1
cbnz x1, .Lloop

mov x0, #0
mov x8, #0x5d // exit
svc #0

On a Pixel 4 (which according to Wikipedia has a CPU based on
Cortex-A76/A55) I measured the median of 1000 runs execution time at
0.554511s, implying an entry/exit cost of 55.5ns. If I make this
modification to the kernel (Pixel 4 uses kernel version 4.14 so this
won't apply cleanly to modern kernels; this is in the same place as
the ptrauth_keys_install_user macro invocation in a modern kernel):

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 303447c765f7..b7c72d767f3e 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -340,6 +340,7 @@ alternative_else_nop_endif
 #endif
 3:
        apply_ssbd 0, 5f, x0, x1
+       mrs x0, sctlr_el1
 5:
        .endif

The median execution time goes to 0.565604s, implying a cost of 1.1ns
to read the system register. I also measured the cost of reading from
a percpu variable at kernel exit like so:

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 303447c765f7..c5a89785f650 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -340,6 +340,7 @@ alternative_else_nop_endif
 #endif
 3:
        apply_ssbd 0, 5f, x0, x1
+       ldr_this_cpu x0, sctlr_el1_cache, x1
 5:
        .endif

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 0b6ca5479f1f..ac9c9a6179d4 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -52,6 +52,8 @@
 #include <asm/system_misc.h>
 #include <asm/sysreg.h>

+DEFINE_PER_CPU(unsigned long, sctlr_el1_cache);
+
 static const char *handler[]= {
        "Synchronous Abort",
        "IRQ",

With that the median execution time goes to 0.600186s, implying a cost
of 4.5ns. So at least on existing hardware, it looks like reading
SCTLR_EL1 directly is probably our cheapest option if we're going to
read *something*. Therefore I'll look at implementing this with
unconditional reads from SCTLR_EL1 in v2.

> > I could try to measure the cost of an unconditional read from
> > SCTLR_EL1 on kernel exit on the hardware that I have access to, but I
> > don't know if that's going to be representative of all hardware,
> > especially the future hardware that will support PAC and MTE.
> >
> > > In a case like this, we'll still get overheads if there are a mixture of
> > > tasks contending for the CPU, that have different key enable settings.
> > > But I can't see much that we can do about that.  If userspace is mostly
> > > built with the same options (true for the Apple case I guess) then I
> > > guess we shouldn't need SCTLR_EL1 rewrites very often just for this.  In
> > > other environments it may be messier.
> >
> > Yes, I don't think we can avoid the write when switching between tasks
> > with different key enabled settings. If userspace arranges to use IA
> > for return addresses (matching the kernel) and IB for function
> > pointers, there would be no need to disable IA for compatibility with
> > legacy binaries, and so the kernel would not require a write on entry,
> > only on exit when transitioning between different settings. The effect
> > is that disabling IA would be more expensive than disabling the other
> > keys.
> >
> > > >
> > > > > > +
> > > > > > +     mrs_s   \tmp3, SYS_SCTLR_EL1
> > > > > > +     bic     \tmp3, \tmp3, \tmp2
> > > > > > +     msr_s   SYS_SCTLR_EL1, \tmp3
> > > > > > +
> > > > > >  .Laddr_auth_skip_\@:
> > > > > >  alternative_if ARM64_HAS_GENERIC_AUTH
> > > > > >       ldp     \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APGA]
> > > > > > @@ -45,6 +53,17 @@ alternative_else_nop_endif
> > > > > >       ldp     \tmp2, \tmp3, [\tmp1, #PTRAUTH_KERNEL_KEY_APIA]
> > > > > >       msr_s   SYS_APIAKEYLO_EL1, \tmp2
> > > > > >       msr_s   SYS_APIAKEYHI_EL1, \tmp3
> > > > > > +
> > > > > > +     ldr     \tmp2, [\tsk, #THREAD_SCTLR_ENXX_MASK]
> > > > > > +     cbz     \tmp2, .Lset_sctlr_skip_\@
> > > > > > +
> > > > > > +     mrs_s   \tmp1, SYS_SCTLR_EL1
> > > > > > +     mov     \tmp2, #(SCTLR_ELx_ENIA | SCTLR_ELx_ENIB | SCTLR_ELx_ENDA)
> > > > >
> > > > > (Nit: harmless but unnecessary ().  # is not an operator as such, just
> > > > > random syntax.  Whatever follows is greedily parsed as an immediate
> > > > > expression.)
> > > >
> > > > Okay. While looking around in the kernel I noticed that there is a
> > > > mov_q macro that can be used to avoid manually splitting the constant
> > > > into 16-bit chunks, and apparently it doesn't require a #. I'll use it
> > > > in v2.
> > > >
> > > > > > +     movk    \tmp2, #SCTLR_ELx_ENDB
> > > > >
> > > > > Why do we check THREAD_SCTLR_ENXX_MASK, and then proceed to set all the
> > > > > ENxx bits unconditionally?  I may be missing something here.
> > > >
> > > > This code is to support the case where we are returning to the kernel
> > > > from a userspace task with keys disabled. The kernel needs at least
> > > > the IA key enabled in order for its own use of reverse-edge PAC to
> > > > work correctly. When returning from a userspace task with no keys
> > > > disabled, the keys enabled bits already have the correct values, so
> > > > there is nothing to be done (and as mentioned above, I avoid touching
> > > > SCTLR_EL1 unless necessary because it is apparently expensive to do
> > > > so). But in a process with keys disabled, we will need to re-enable at
> > > > least IA.
> > > >
> > > > We may be able to get away with just enabling IA here, but that would
> > > > break the invariant that all keys are enabled inside the kernel, which
> > > > is relied on by the code that decides whether to access SCTLR_EL1 in
> > > > order to disable keys when entering a userspace task.
> > >
> > > OK, I think I just confused myself here: we are not setting the key
> > > enables for userspace, but for the kernel, and we only need to do that
> > > if the user task had some keys disabled in the first place.
> > >
> > > >
> > > > > > +     orr     \tmp1, \tmp1, \tmp2
> > > > > > +     msr_s   SYS_SCTLR_EL1, \tmp1
> > > > > > +
> > > > > > +.Lset_sctlr_skip_\@:
> > > > > >       .endm
> > > > > >
> > > > > >       .macro ptrauth_keys_install_kernel_nosync tsk, tmp1, tmp2, tmp3
> > > > > > diff --git a/arch/arm64/include/asm/pointer_auth.h b/arch/arm64/include/asm/pointer_auth.h
> > > > > > index c6b4f0603024..d4c375454a36 100644
> > > > > > --- a/arch/arm64/include/asm/pointer_auth.h
> > > > > > +++ b/arch/arm64/include/asm/pointer_auth.h
> > > > > > @@ -70,14 +70,19 @@ static __always_inline void ptrauth_keys_switch_kernel(struct ptrauth_keys_kerne
> > > > > >  }
> > > > > >
> > > > > >  extern int ptrauth_prctl_reset_keys(struct task_struct *tsk, unsigned long arg);
> > > > > > +extern int ptrauth_prctl_set_enabled_keys(struct task_struct *tsk,
> > > > > > +                                       unsigned long keys,
> > > > > > +                                       unsigned long enabled);
> > > > > >
> > > > > >  static inline unsigned long ptrauth_strip_insn_pac(unsigned long ptr)
> > > > > >  {
> > > > > >       return ptrauth_clear_pac(ptr);
> > > > > >  }
> > > > > >
> > > > > > -#define ptrauth_thread_init_user(tsk)                                        \
> > > > > > -     ptrauth_keys_init_user(&(tsk)->thread.keys_user)
> > > > > > +#define ptrauth_thread_init_user(tsk) do {                           \
> > > > > > +             ptrauth_keys_init_user(&(tsk)->thread.keys_user);       \
> > > > > > +             (tsk)->thread.sctlr_enxx_mask = 0;                      \
> > > > > > +     } while (0)
> > > > > >  #define ptrauth_thread_init_kernel(tsk)                                      \
> > > > > >       ptrauth_keys_init_kernel(&(tsk)->thread.keys_kernel)
> > > > > >  #define ptrauth_thread_switch_kernel(tsk)                            \
> > > > > > @@ -85,6 +90,7 @@ static inline unsigned long ptrauth_strip_insn_pac(unsigned long ptr)
> > > > > >
> > > > > >  #else /* CONFIG_ARM64_PTR_AUTH */
> > > > > >  #define ptrauth_prctl_reset_keys(tsk, arg)   (-EINVAL)
> > > > > > +#define ptrauth_prctl_set_enabled_keys(tsk, keys, enabled)   (-EINVAL)
> > > > > >  #define ptrauth_strip_insn_pac(lr)   (lr)
> > > > > >  #define ptrauth_thread_init_user(tsk)
> > > > > >  #define ptrauth_thread_init_kernel(tsk)
> > > > > > diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> > > > > > index 240fe5e5b720..6974d227b01f 100644
> > > > > > --- a/arch/arm64/include/asm/processor.h
> > > > > > +++ b/arch/arm64/include/asm/processor.h
> > > > > > @@ -150,6 +150,7 @@ struct thread_struct {
> > > > > >  #ifdef CONFIG_ARM64_PTR_AUTH
> > > > > >       struct ptrauth_keys_user        keys_user;
> > > > > >       struct ptrauth_keys_kernel      keys_kernel;
> > > > > > +     u64                             sctlr_enxx_mask;
> > > > > >  #endif
> > > > > >  };
> > > > > >
> > > > > > @@ -313,6 +314,10 @@ extern void __init minsigstksz_setup(void);
> > > > > >  /* PR_PAC_RESET_KEYS prctl */
> > > > > >  #define PAC_RESET_KEYS(tsk, arg)     ptrauth_prctl_reset_keys(tsk, arg)
> > > > > >
> > > > > > +/* PR_PAC_SET_ENABLED_KEYS prctl */
> > > > > > +#define PAC_SET_ENABLED_KEYS(tsk, keys, enabled)                             \
> > > > > > +     ptrauth_prctl_set_enabled_keys(tsk, keys, enabled)
> > > > > > +
> > > > > >  #ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
> > > > > >  /* PR_{SET,GET}_TAGGED_ADDR_CTRL prctl */
> > > > > >  long set_tagged_addr_ctrl(unsigned long arg);
> > > > > > diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> > > > > > index 0577e2142284..dac80e16fe35 100644
> > > > > > --- a/arch/arm64/kernel/asm-offsets.c
> > > > > > +++ b/arch/arm64/kernel/asm-offsets.c
> > > > > > @@ -47,6 +47,7 @@ int main(void)
> > > > > >  #ifdef CONFIG_ARM64_PTR_AUTH
> > > > > >    DEFINE(THREAD_KEYS_USER,   offsetof(struct task_struct, thread.keys_user));
> > > > > >    DEFINE(THREAD_KEYS_KERNEL, offsetof(struct task_struct, thread.keys_kernel));
> > > > > > +  DEFINE(THREAD_SCTLR_ENXX_MASK,offsetof(struct task_struct, thread.sctlr_enxx_mask));
> > > > > >  #endif
> > > > > >    BLANK();
> > > > > >    DEFINE(S_X0,                       offsetof(struct pt_regs, regs[0]));
> > > > > > diff --git a/arch/arm64/kernel/pointer_auth.c b/arch/arm64/kernel/pointer_auth.c
> > > > > > index 1e77736a4f66..8c385b7f324a 100644
> > > > > > --- a/arch/arm64/kernel/pointer_auth.c
> > > > > > +++ b/arch/arm64/kernel/pointer_auth.c
> > > > > > @@ -42,3 +42,37 @@ int ptrauth_prctl_reset_keys(struct task_struct *tsk, unsigned long arg)
> > > > > >
> > > > > >       return 0;
> > > > > >  }
> > > > > > +
> > > > > > +static u64 arg_to_enxx_mask(unsigned long arg)
> > > > > > +{
> > > > > > +     u64 sctlr_enxx_mask = 0;
> > > > > > +     if (arg & PR_PAC_APIAKEY)
> > > > > > +             sctlr_enxx_mask |= SCTLR_ELx_ENIA;
> > > > > > +     if (arg & PR_PAC_APIBKEY)
> > > > > > +             sctlr_enxx_mask |= SCTLR_ELx_ENIB;
> > > > > > +     if (arg & PR_PAC_APDAKEY)
> > > > > > +             sctlr_enxx_mask |= SCTLR_ELx_ENDA;
> > > > > > +     if (arg & PR_PAC_APDBKEY)
> > > > > > +             sctlr_enxx_mask |= SCTLR_ELx_ENDB;
> > > > > > +     return sctlr_enxx_mask;
> > > > > > +}
> > > > > > +
> > > > > > +int ptrauth_prctl_set_enabled_keys(struct task_struct *tsk, unsigned long keys,
> > > > > > +                                unsigned long enabled)
> > > > > > +{
> > > > > > +     u64 sctlr_enxx_mask = tsk->thread.sctlr_enxx_mask;
> > > > > > +     unsigned long addr_key_mask = PR_PAC_APIAKEY | PR_PAC_APIBKEY |
> > > > > > +                                   PR_PAC_APDAKEY | PR_PAC_APDBKEY;
> > > > > > +
> > > > > > +     if (!system_supports_address_auth())
> > > > > > +             return -EINVAL;
> > > > > > +
> > > > > > +     if ((keys & ~addr_key_mask) || (enabled & ~keys))
> > > > > > +             return -EINVAL;
> > > > >
> > > > > Should we take the types of authentication supported?
> > > > >
> > > > > I don't recall whether we expose ptrauth to userspace if only
> > > > > instruction authentication or only data authentication is supported.  If
> > > > > so, should we reject attempts to configure unsupported keys here?
> > > > >
> > > > > We should probably try to do a consistent thing both here and in
> > > > > PR_PAC_RESET_KEYS if so.
> > > >
> > > > As far as I know, there is nothing in the architecture that would
> > > > allow it to only advertise support for I keys or only advertise
> > > > support for D keys. The fields AA64ISAR1_EL1.AP[AI] apply to all four
> > > > keys: DA, DB, IA and IB. Maybe you are thinking of the GA key versus
> > > > the other keys (which is advertised separately via
> > > > AA64ISAR1_EL1.GP[AI])? The architecture appears to provide no way to
> > > > disable the GA key, so I did not include support for it here.
> > >
> > > I think I'm confusing myself here.  Yes, support for generic auth is
> > > the (supposedly) architecturally orthogonal to address auth, but data
> > > and instruction address auth are either both supported, or both not
> > > supported -- so your code looks correct.
> > >
> > > >
> > > > > > +
> > > > > > +     sctlr_enxx_mask |= arg_to_enxx_mask(keys);
> > > > > > +     sctlr_enxx_mask &= ~arg_to_enxx_mask(enabled);
> > > > > > +
> > > > > > +     tsk->thread.sctlr_enxx_mask = sctlr_enxx_mask;
> > > > > > +     return 0;
> > > > >
> > > > > Do we need a way to query the enabled keys?
> > > > >
> > > > > We could either have a _GET_ prctl (the common approach), or have this
> > > > > prctl return the mask of enabled keys (avoids the extra prctl, but
> > > > > weirder).
> > > > >
> > > > > As above, we might
> > > > >
> > > > > Things like CRIU may need a GET in order to save/restore this setting
> > > > > properly.
> > > >
> > > > Maybe it makes sense for there to be a GET prctl() to support CRIU.
> > > > But it would need to be defined carefully because CRIU would
> > > > presumably need to know what value to pass as the "keys" argument when
> > > > it calls SET to restore the state. It can't just hardcode a value
> > > > because that may harm extensibility if new keys are added.
> > > >
> > > > If we require the kernel to start processes with all keys enabled
> > > > (including any keys that we may introduce in the future), then it
> > > > seems like if CRIU knows which keys were disabled, then it can restore
> > > > the state by issuing a syscall like this:
> > > >
> > > > prctl(PR_PAC_SET_ENABLED_KEYS, disabled_keys_mask, 0)
> > > >
> > > > which would imply that instead of providing PR_PAC_GET_ENABLED_KEYS we
> > > > instead provide PR_PAC_GET_DISABLED_KEYS, which CRIU would call when
> > > > saving the state to prepare the disabled_keys_mask argument passed
> > > > here. Then for consistency we can make the SET prctl() be
> > > > PR_PAC_SET_DISABLED_KEYS, and CRIU can then do:
> > > >
> > > > prctl(PR_PAC_SET_DISABLED_KEYS, disabled_keys_mask, disabled_keys_mask)
> > > >
> > > > Does that make sense?
> > >
> > > Possibly, though it's nicer if the GET returns the same value suppiled
> > > to the SET, rather than the complement of it.
> >
> > Maybe you misread my proposal? The part about the complement is
> > addressed by the sentence beginning "Then for consistency..." So we
> > would end up with PR_PAC_SET_DISABLED_KEYS and
> > PR_PAC_GET_DISABLED_KEYS, and they would both take/return a mask of
> > *disabled* keys.
>
> Oh, I see.  Yes, while I prefer the two are consistent with each other,
> we can flip the sense of both here if desired.
>
> >
> > > If SET ignores set bits in arg3 that don't correspond to set bits in the
> > > mask arg2, then
> > >
> > >         u64 current_keys = prctl(PR_PAC_GET_ENABLED_KEYS);
> > >
> > >         prctl(PR_PAC_SET_ENABLED_KEYS, ~0UL, current_keys);
> > >
> > > should work.
> >
> > I'd prefer not to allow the SET prctl to accept all-ones as the first
> > argument, as this could lead to a scenario where sometimes people pass
> > all-ones as the first argument and sometimes they don't, which could
> > make it hard to add new keys in the future (since the addition of a
> > new key could cause existing callers to affect the new key
> > arbitrarily). Instead, the first argument should be required to
> > specify only known keys, in order to enforce that the caller is
> > explicit about which keys they intend to affect.
>
> That sounds fair.
>
> > > There's a final option, which is to expose this config through ptrace
> > > instead for save/restore purposes.  From previous discussions with the
> > > CRIU folks, I recall that they are trying to move away from doing setup
> > > from within the new process where possible.
> > >
> > > There's no reason not to have both though -- there's precedent for that,
> > > such as for PR_SVE_{GET,SET}_VL and the NT_ARM_SVE regset.  MTE may move
> > > in a similar direction too IIUC.
> > >
> > > Having a GET remains useful for in-process debugging and diagnostics,
> > > and it's extremely straightforward to add in the kernel.  So from my
> > > side I'd vote to have it anyway...
> >
> > Okay. I see that the MTE series is adding NT_ARM_TAGGED_ADDR_CTRL as
> > well, and I suppose that I could add a
> > NT_ARM_PAC_(whatever)ABLED_KEYS. Since this wouldn't be used in normal
> > userspace applications (and it doesn't look like ptrace APIs such as
> > NT_ARM_PACA_KEYS will accommodate new keys being added so CRIU would
> > need to be made aware of new keys anyway), we might as well just have
> > all the APIs deal with the set of enabled keys.
> >
> > And if the ptrace is enough for CRIU, I'd mildly prefer not to add a
> > GET prctl, since it's just more uapi surface that needs to be
> > maintained forever. For in-process use cases it seems redundant with
> > e.g. issuing a pacia instruction and checking whether it is a no-op,
>
> Probing instructions to see whether they generate signals is a cardinal
> sin and we make huge efforts to ensure that userspace doesn't need to do
> that.  As for the no-op check, I suppose that works, but it's nasty if
> the generic "check all the required featues" boilerplate on program
> entry relies on intrinsics or can't be written in C at all.  This is
> just a way of discouraging people from checking at all IMHO.
>
>
> The implementation of that prctl would most likely be a one-line wrapper
> that calls the same internal function used to populate the regset for
> ptrace, so I think you might be worrying a bit too much about creating
> ABI here.  This is more providing a second interface for same thing,
> because no single interface is usable in all situations.

Okay. Since I don't feel too strongly about it, I'll add the GET prctl in v2.

Peter