[PATCH] arm64: mte: avoid clearing PSTATE.TCO on entry unless necessary

Peter Collingbourne pcc at google.com
Fri Nov 5 13:18:38 PDT 2021


On Thu, Oct 7, 2021 at 3:20 AM Mark Rutland <mark.rutland at arm.com> wrote:
>
> On Tue, Oct 05, 2021 at 12:08:58PM -0700, Peter Collingbourne wrote:
> > On Tue, Oct 5, 2021 at 9:46 AM Catalin Marinas <catalin.marinas at arm.com> wrote:
> > > On Wed, Sep 29, 2021 at 12:45:24PM -0700, Peter Collingbourne wrote:
> > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > index 2f69ae43941d..85ead6bbb38e 100644
> > > > --- a/arch/arm64/kernel/entry.S
> > > > +++ b/arch/arm64/kernel/entry.S
> > > > @@ -269,7 +269,28 @@ alternative_else_nop_endif
> > > >       .else
> > > >       add     x21, sp, #PT_REGS_SIZE
> > > >       get_current_task tsk
> > > > +     ldr     x0, [tsk, THREAD_SCTLR_USER]
> > > >       .endif /* \el == 0 */
> > > > +
> > > > +     /*
> > > > +      * Re-enable tag checking (TCO set on exception entry). This is only
> > > > +      * necessary if MTE is enabled in either the kernel or the userspace
> > > > +      * task in synchronous mode. With MTE disabled in the kernel and
> > > > +      * disabled or asynchronous in userspace, tag check faults (including in
> > > > +      * uaccesses) are not reported, therefore there is no need to re-enable
> > > > +      * checking. This is beneficial on microarchitectures where re-enabling
> > > > +      * TCO is expensive.
> > > > +      */
> > > > +#ifdef CONFIG_ARM64_MTE
> > > > +alternative_cb       kasan_hw_tags_enable
> > > > +     tbz     x0, #SCTLR_EL1_TCF0_SHIFT, 1f
> > > > +alternative_cb_end
> > > > +alternative_if ARM64_MTE
> > > > +     SET_PSTATE_TCO(0)
> > > > +alternative_else_nop_endif
> > > > +1:
> > > > +#endif
> > >
> > > I think we can get here from an interrupt as well. Can we guarantee that
> > > the sctlr_user is valid? We are not always in a user process context.
> >
> > Looking through the code in entry.S carefully it doesn't appear that
> > the tsk pointer is ever used when taking an exception from EL1. The
> > last user appears to have been removed in commit 3d2403fd10a1 ("arm64:
> > uaccess: remove set_fs()"). I just did a quick boot test on a couple
> > of platforms and things seem to work without the "get_current_task
> > tsk" line.
> >
> > So I can't be confident that tsk points to a valid task, but at least
> > it looks like it was the case prior to that commit.

The EL1 entries unconditionally call enter_from_kernel_mode(), which
(at least in some configurations) itself accesses current
unconditionally, so I'm reasonably confident that we have a valid
current pointer here.
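
For context, the early EL1 entry path looks roughly like this
(paraphrased from entry-common.c from memory, so the details will vary
by configuration and kernel version); the is_idle_task(current) check
is the access to current I mean:

static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs)
{
	regs->exit_rcu = false;

	/* With !CONFIG_TINY_RCU this dereferences current on every EL1 entry. */
	if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
		lockdep_hardirqs_off(CALLER_ADDR0);
		rcu_irq_enter();
		trace_hardirqs_off_finish();

		regs->exit_rcu = true;
		return;
	}

	lockdep_hardirqs_off(CALLER_ADDR0);
	rcu_irq_enter_check_tick();
	trace_hardirqs_off_finish();
}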

> > > Maybe only do the above checks if \el == 0, otherwise just bracket it
> > > with kasan_hw_tags_enable.
> >
> > Is it possible for us to do a uaccess inside the EL1 -> EL1 exception
> > handler? That was the one thing I was unsure about and it's why I
> > opted to do the same check regardless of EL.
>
> Today we can do so if we take a PMU IRQ from EL1 and try to do an EL0
> stack unwind or stack dump, and similar holds for some eBPF stuff.
>
> We also need to ensure that PSTATE.TCO is configured consistently so
> that context-switch works, otherwise where a CPU switches between tasks
> A and B, where A is preempted by an EL1 IRQ, and B is explicitly
> switching via a direct call to schedule(), the state of TCO will not be
> as expected (unless we track this per thread, and handle it in the
> context switch).

Isn't this already context switched via storing the per-task SPSR?
Otherwise e.g. the TCO controls in __uaccess_disable_tco() and
__uaccess_enable_tco() would have the same problem.
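
(For reference, those helpers are roughly the following -- quoting
arch/arm64/include/asm/uaccess.h from memory, so the exact form may
differ between kernel versions:)

static inline void __uaccess_disable_tco(void)
{
	asm volatile(ALTERNATIVE("nop", SET_PSTATE_TCO(0),
				 ARM64_MTE, CONFIG_KASAN_HW_TAGS));
}

static inline void __uaccess_enable_tco(void)
{
	asm volatile(ALTERNATIVE("nop", SET_PSTATE_TCO(1),
				 ARM64_MTE, CONFIG_KASAN_HW_TAGS));
}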

> I'd strongly prefer that the state of PSTATE.TCO upon taking an
> exception to EL1 is consistent (by the end of the early entry code),
> regardless of where that was taken from.
>
> Is it easier to manage this from within entry-common.c?

It doesn't look like entry-common.c currently has a single location
where we can add code that runs on every kernel entry. I tried adding
one (patch below), but this ended up causing a performance regression
(versus baseline) on some of the cores on the hardware that I have
access to.

Peter

diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 64cfe4a3798f..ad066b09a6b7 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -17,6 +17,23 @@
 #include <asm/mmu.h>
 #include <asm/sysreg.h>

+static void mte_disable_tco_entry(void)
+{
+	/*
+	 * Re-enable tag checking (TCO set on exception entry). This is only
+	 * necessary if MTE is enabled in either the kernel or the userspace
+	 * task in synchronous mode. With MTE disabled in the kernel and
+	 * disabled or asynchronous in userspace, tag check faults (including in
+	 * uaccesses) are not reported, therefore there is no need to re-enable
+	 * checking. This is beneficial on microarchitectures where re-enabling
+	 * TCO is expensive.
+	 */
+	if (IS_ENABLED(CONFIG_ARM64_MTE) &&
+	    (kasan_hw_tags_enabled() ||
+	     (current->thread.sctlr_user & (1UL << SCTLR_EL1_TCF0_SHIFT))))
+		asm volatile(SET_PSTATE_TCO(0));
+}
+
 /*
  * This is intended to match the logic in irqentry_enter(), handling the kernel
  * mode transitions only.
@@ -39,6 +56,7 @@ static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
 	trace_hardirqs_off_finish();
 
 	mte_check_tfsr_entry();
+	mte_disable_tco_entry();
 }

 /*
@@ -235,6 +253,7 @@ asmlinkage void noinstr enter_from_user_mode(void)
 	CT_WARN_ON(ct_state() != CONTEXT_USER);
 	user_exit_irqoff();
 	trace_hardirqs_off_finish();
+	mte_disable_tco_entry();
 }

 asmlinkage void noinstr exit_to_user_mode(void)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 42c404d508ca..8c63279ffea7 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -339,13 +339,6 @@ alternative_if ARM64_HAS_IRQ_PRIO_MASKING
 	msr_s	SYS_ICC_PMR_EL1, x20
 alternative_else_nop_endif
 
-	/* Re-enable tag checking (TCO set on exception entry) */
-#ifdef CONFIG_ARM64_MTE
-alternative_if ARM64_MTE
-	SET_PSTATE_TCO(0)
-alternative_else_nop_endif
-#endif
-
 	/*
 	 * Registers that may be useful after this macro is invoked:
 	 *


