[PATCH v7] arm64: mte: allow async MTE to be upgraded to sync on a per-CPU basis

Peter Collingbourne pcc at google.com
Wed Jun 30 16:15:18 PDT 2021


On Wed, Jun 30, 2021 at 10:34 AM Catalin Marinas
<catalin.marinas at arm.com> wrote:
>
> On Tue, Jun 29, 2021 at 04:57:04PM -0700, Peter Collingbourne wrote:
> > diff --git a/Documentation/arm64/memory-tagging-extension.rst b/Documentation/arm64/memory-tagging-extension.rst
> > index b540178a93f8..39c04541f57c 100644
> > --- a/Documentation/arm64/memory-tagging-extension.rst
> > +++ b/Documentation/arm64/memory-tagging-extension.rst
> > @@ -77,14 +77,18 @@ configurable behaviours:
> >    address is unknown).
> >
> >  The user can select the above modes, per thread, using the
> > -``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
> > -``flags`` contain one of the following values in the ``PR_MTE_TCF_MASK``
> > +``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where ``flags``
> > +contains any number of the following values in the ``PR_MTE_TCF_MASK``
> >  bit-field:
> >
> > -- ``PR_MTE_TCF_NONE``  - *Ignore* tag check faults
>
> I'd keep this and maybe add a comment that it will be ignored if
> combined with any other options. IOW, it's already user API and I'd keep
> it together with all the definition in prctl.h (you haven't removed it).

Done.

> >  - ``PR_MTE_TCF_SYNC``  - *Synchronous* tag check fault mode
> >  - ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode
> >
> > +If no modes are specified, tag check faults are ignored. If a single
> > +mode is specified, the program will run in that mode. If multiple
> > +modes are specified, the mode is selected as described in the "Per-CPU
> > +preferred tag checking modes" section below.
> > +
> >  The current tag check fault mode can be read using the
> >  ``prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)`` system call.
> >
> > @@ -120,13 +124,37 @@ in the ``PR_MTE_TAG_MASK`` bit-field.
> >  interface provides an include mask. An include mask of ``0`` (exclusion
> >  mask ``0xffff``) results in the CPU always generating tag ``0``.
> >
> > +Per-CPU preferred tag checking mode
> > +-----------------------------------
> > +
> > +On some CPUs the performance of MTE in stricter tag checking modes
> > +is similar to that of less strict tag checking modes. This makes it
> > +worthwhile to enable stricter checks on those CPUs when a less strict
> > +checking mode is requested, in order to gain the error detection
> > +benefits of the stricter checks without the performance downsides. To
> > +support this scenario, a privileged user may configure a stricter
> > +tag checking mode as the CPU's preferred tag checking mode.
> > +
> > +The preferred tag checking mode for each CPU is controlled by
> > +``/sys/devices/system/cpu/cpu<N>/mte_tcf_preferred``, to which a
> > +privileged user may write the value ``async`` or ``sync``.  The default
> > +preferred mode for each CPU is ``async``.
> > +
> > +To allow a program to potentially run in the CPU's preferred tag
> > +checking mode, the user program may set multiple tag check fault mode
> > +bits in the ``flags`` argument to the ``prctl(PR_SET_TAGGED_ADDR_CTRL,
> > +flags, 0, 0, 0)`` system call. If the CPU's preferred tag checking mode
> > +is in the task's set of provided tag checking modes, that mode will
> > +be selected. Otherwise, the least strict mode in the set is selected,
> > +where ``async`` is considered less strict than ``sync``.
>
> Specifying async vs sync order at this stage probably doesn't bring
> much, though I don't mind as we'll add asym at some point. We already
> have the default "async" in the mte_tcf_preferred, so async will be
> selected if an app selects both. Also the intersection of task mode and
> mte_tcf_preferred is only void if a single mode is selected by the user,
> in which that case that mode would have to be honoured.
>
> Where it gets more interesting is when we add the asym mode and current
> applications only set sync|async. I expect CPUs that implement sync
> sufficiently fast will continue with this preferred mode. The others may
> likely want asym as a middle ground. With the bit-based order, sync will
> be selected.
>
> We might as well leave this debate until we have asym support since the
> order doesn't matter with only two modes.

I'm not sure I agree about the bit-based order but we don't need to
specify it now, as you mentioned. So I reworded this to avoid
specifying it.

> > +
> >  Initial process state
> >  ---------------------
> >
> >  On ``execve()``, the new process has the following configuration:
> >
> >  - ``PR_TAGGED_ADDR_ENABLE`` set to 0 (disabled)
> > -- Tag checking mode set to ``PR_MTE_TCF_NONE``
> > +- No tag checking modes are selected (tag check faults ignored)
> >  - ``PR_MTE_TAG_MASK`` set to 0 (all tags excluded)
> >  - ``PSTATE.TCO`` set to 0
> >  - ``PROT_MTE`` not set on any of the initial memory maps
>
> I think we should also update the example to show the or'ing of the
> modes into the prctl() flags.

Done.

> > diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> > index 9df3feeee890..6ad7b6c188f4 100644
> > --- a/arch/arm64/include/asm/processor.h
> > +++ b/arch/arm64/include/asm/processor.h
> > @@ -16,6 +16,13 @@
> >   */
> >  #define NET_IP_ALIGN 0
> >
> > +#define MTE_CTRL_GCR_USER_EXCL_SHIFT 0
> > +#define MTE_CTRL_GCR_USER_EXCL_MASK  0xffff
> > +
> > +#define MTE_CTRL_TCF_NONE            (1UL << 16)
>
> Do we need this bit for none? I think we can get rid of it and simplify
> the if/else blocks slightly.

No, I've removed it. I figured maybe we would want this if we wanted
to allow upgrading from NONE, but that doesn't need to be implemented
now.

> > +#define MTE_CTRL_TCF_SYNC            (1UL << 17)
> > +#define MTE_CTRL_TCF_ASYNC           (1UL << 18)
> > +
> >  #ifndef __ASSEMBLY__
> >
> >  #include <linux/build_bug.h>
> > diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> > index 125a10e413e9..c85eb9bd5a37 100644
> > --- a/arch/arm64/kernel/mte.c
> > +++ b/arch/arm64/kernel/mte.c
> > @@ -216,15 +209,34 @@ void mte_thread_init_user(void)
> >       dsb(ish);
> >       write_sysreg_s(0, SYS_TFSRE0_EL1);
> >       clear_thread_flag(TIF_MTE_ASYNC_FAULT);
> > -     /* disable tag checking */
> > -     set_task_sctlr_el1((current->thread.sctlr_user & ~SCTLR_EL1_TCF0_MASK) |
> > -                        SCTLR_EL1_TCF0_NONE);
> > -     /* reset tag generation mask */
> > -     set_gcr_el1_excl(SYS_GCR_EL1_EXCL_MASK);
> > +     /* disable tag checking and reset tag generation mask */
> > +     current->thread.mte_ctrl =
> > +             MTE_CTRL_GCR_USER_EXCL_MASK | MTE_CTRL_TCF_NONE;
>
> We can drop MTE_CTRL_TCF_NONE here. If no async or sync mode is implied,
> we assume no tag checking without an explicit 'none'.

Done.

> > +     mte_update_sctlr_user(current);
> > +     set_task_sctlr_el1(current->thread.sctlr_user);
> > +}
> > +
> > +void mte_update_sctlr_user(struct task_struct *task)
> > +{
> > +     unsigned long sctlr = task->thread.sctlr_user;
> > +     unsigned long pref = __this_cpu_read(mte_tcf_preferred);
>
> We need to ensure this is called in a non-preemptible context. I don't
> think the prctl() can guarantee that, so we get some weird behaviour if
> it gets context switched mid-way through this. The callback from
> mte_thread_switch() is fine and I think smp_call_function_single() does
> a get_cpu() already.
>
> I'd also add a function comment that it can only be called on the
> current or next task otherwise since the CPU must match where the thread
> is going to run.

Done.

> Maybe easier to just pass the CPU number in addition to task. In most
> cases it is smp_processor_id() while for prctl() we'd do a
> get_cpu/put_cpu(). This would be cleaner unless this_cpu_read() is
> (marginally) faster than per_cpu().

It seems like the simplest thing to do would be to add
preempt_disable()/preempt_enable() to this function, so that's what
I've done.

> > +     unsigned long mte_ctrl = task->thread.mte_ctrl;
> > +     unsigned long resolved_mte_tcf = (mte_ctrl & pref) ? pref : mte_ctrl;
> > +
> > +     sctlr &= ~SCTLR_EL1_TCF0_MASK;
> > +     if (resolved_mte_tcf & MTE_CTRL_TCF_NONE)
> > +             sctlr |= SCTLR_EL1_TCF0_NONE;
>
> Same comment on dropping MTE_CTRL_TCF_NONE.

Done.

> > +     else if (resolved_mte_tcf & MTE_CTRL_TCF_ASYNC)
> > +             sctlr |= SCTLR_EL1_TCF0_ASYNC;
> > +     else if (resolved_mte_tcf & MTE_CTRL_TCF_SYNC)
> > +             sctlr |= SCTLR_EL1_TCF0_SYNC;
> > +     task->thread.sctlr_user = sctlr;
> >  }
> >
> >  void mte_thread_switch(struct task_struct *next)
> >  {
> > +     mte_update_sctlr_user(next);
> > +
> >       /*
> >        * Check if an async tag exception occurred at EL1.
> >        *
> > @@ -262,33 +274,24 @@ void mte_suspend_exit(void)
> >
> >  long set_mte_ctrl(struct task_struct *task, unsigned long arg)
> >  {
> > -     u64 sctlr = task->thread.sctlr_user & ~SCTLR_EL1_TCF0_MASK;
> > -     u64 gcr_excl = ~((arg & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT) &
> > -                    SYS_GCR_EL1_EXCL_MASK;
> > +     u64 mte_ctrl = (~((arg & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT) &
> > +                     SYS_GCR_EL1_EXCL_MASK)
> > +                    << MTE_CTRL_GCR_USER_EXCL_SHIFT;
>
> Nit: just move the << MTE_CTRL_GCR_USER_EXCL_SHIFT to the previous line
> (it gets to about 80 characters, I don't have a strict limit).

Done. It looks like it's 79 columns in the end, not sure why
clang-format puts it on a separate line.

> >       if (!system_supports_mte())
> >               return 0;
> >
> > -     switch (arg & PR_MTE_TCF_MASK) {
> > -     case PR_MTE_TCF_NONE:
> > -             sctlr |= SCTLR_EL1_TCF0_NONE;
> > -             break;
> > -     case PR_MTE_TCF_SYNC:
> > -             sctlr |= SCTLR_EL1_TCF0_SYNC;
> > -             break;
> > -     case PR_MTE_TCF_ASYNC:
> > -             sctlr |= SCTLR_EL1_TCF0_ASYNC;
> > -             break;
> > -     default:
> > -             return -EINVAL;
> > -     }
> > -
> > -     if (task != current) {
> > -             task->thread.sctlr_user = sctlr;
> > -             task->thread.gcr_user_excl = gcr_excl;
> > -     } else {
> > -             set_task_sctlr_el1(sctlr);
> > -             set_gcr_el1_excl(gcr_excl);
> > +     if ((arg & PR_MTE_TCF_MASK) == PR_MTE_TCF_NONE)
> > +             mte_ctrl |= MTE_CTRL_TCF_NONE;
> > +     if (arg & PR_MTE_TCF_ASYNC)
> > +             mte_ctrl |= MTE_CTRL_TCF_ASYNC;
> > +     if (arg & PR_MTE_TCF_SYNC)
> > +             mte_ctrl |= MTE_CTRL_TCF_SYNC;
> > +
> > +     task->thread.mte_ctrl = mte_ctrl;
> > +     if (task == current) {
> > +             mte_update_sctlr_user(task);
> > +             set_task_sctlr_el1(task->thread.sctlr_user);
>
> Here I don't think we guarantee the preemption being disabled.
>
> > diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> > index 18a9f59dc067..48de354b847f 100644
> > --- a/include/uapi/linux/prctl.h
> > +++ b/include/uapi/linux/prctl.h
> > @@ -234,11 +234,10 @@ struct prctl_mm_map {
> >  #define PR_GET_TAGGED_ADDR_CTRL              56
> >  # define PR_TAGGED_ADDR_ENABLE               (1UL << 0)
> >  /* MTE tag check fault modes */
> > -# define PR_MTE_TCF_SHIFT            1
> > -# define PR_MTE_TCF_NONE             (0UL << PR_MTE_TCF_SHIFT)
> > -# define PR_MTE_TCF_SYNC             (1UL << PR_MTE_TCF_SHIFT)
> > -# define PR_MTE_TCF_ASYNC            (2UL << PR_MTE_TCF_SHIFT)
> > -# define PR_MTE_TCF_MASK             (3UL << PR_MTE_TCF_SHIFT)
> > +# define PR_MTE_TCF_NONE             0
> > +# define PR_MTE_TCF_SYNC             (1UL << 1)
> > +# define PR_MTE_TCF_ASYNC            (1UL << 2)
> > +# define PR_MTE_TCF_MASK             (PR_MTE_TCF_SYNC | PR_MTE_TCF_ASYNC)
>
> I'm fine with redefining these but removing PR_MTE_TCF_SHIFT may break
> some current user-space compilation. It does look weird to keep it
> around though.

Okay, I brought back PR_MTE_TCF_SHIFT with a comment to say that it's unused.

> Anyway, I think the patch looks alright overall but I'd split it in 3-4
> small patches for easier reviewing. Something like:
>
> 0. Cover letter ;)
>
> 1. Rename gcr_user_excl to mte_ctrl
>
> 2. Use individual bits for TCF sync/async and allow combinations. This
>    would update the uapi prctl.h. On its own, this patch will have to
>    choose a preferred mode since we don't have the sysfs interface yet.
>
> 3. Introduce the sysfs interface and the preferred mode selection
>
> 4. Update documentation

Done.

Peter



More information about the linux-arm-kernel mailing list