[PATCH v4] arm64: mte: allow async MTE to be upgraded to sync on a per-CPU basis

Peter Collingbourne pcc at google.com
Tue Jun 15 13:06:25 PDT 2021


On Tue, Jun 15, 2021 at 11:02 AM Catalin Marinas
<catalin.marinas at arm.com> wrote:
>
> On Mon, Jun 14, 2021 at 06:57:28PM -0700, Peter Collingbourne wrote:
> > @@ -120,6 +120,25 @@ in the ``PR_MTE_TAG_MASK`` bit-field.
> >  interface provides an include mask. An include mask of ``0`` (exclusion
> >  mask ``0xffff``) results in the CPU always generating tag ``0``.
> >
> > +Upgrading to stricter tag checking modes
> > +----------------------------------------
> > +
> > +On some CPUs the performance of MTE in stricter tag checking modes
> > +is similar to that of less strict tag checking modes. This makes it
> > +worthwhile to enable stricter checks on those CPUs when a less strict
> > +checking mode is requested, in order to gain the error detection
> > +benefits of the stricter checks without the performance downsides. To
> > +opt into upgrading to a stricter checking mode on those CPUs, the user
> > +can set the ``PR_MTE_DYNAMIC_TCF`` flag bit in the ``flags`` argument
> > +to the ``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call.
> > +
> > +This feature is currently only supported for upgrading from
> > +asynchronous mode. To configure a CPU to upgrade from asynchronous mode
> > +to synchronous mode, a privileged user may write the value ``1`` to
> > +``/sys/devices/system/cpu/cpu<N>/mte_upgrade_async``, and to disable
> > +upgrading they may write the value ``2``. By default the feature is
> > +disabled on all CPUs.
>
> This needs updating as well for 0 as disabled.

Will fix.

> I wonder whether we could generalise this to something like
> mte_tcf_upgrade and allow asymmetric to be expanded to sync. Otherwise
> we'd have to add another interface when we know that if a CPU can handle
> sync as fast as async, the asymmetric mode should also be upgraded. So a
> more generic mte_tcf_upgrade just holds the strictest that the CPU can
> handle without significant performance degradation.

Maybe. But that wouldn't accommodate all scenarios. E.g. what if asym
should be upgraded to sync but async should not be upgraded? Or what if
we wanted to allow upgrading from none? If we wanted to expand the
interface you've described to cover these upgrade patterns, I think
things could get confusing. The way things work now, we should be able
to extend it so that all of these can be configured uniformly.

Since the system designer knows which modes are supported on a
particular system, it doesn't seem like it would be a big concern to
require all of them to be configured.

> The mte_upgrade_async can be confusing as well for the asymmetric mode
> where the writes are asynchronous.

It seems equivalent to e.g. the TCF constants being named async, asym, sync.

> > diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> > index 125a10e413e9..ad85e8519669 100644
> > --- a/arch/arm64/kernel/mte.c
> > +++ b/arch/arm64/kernel/mte.c
> > @@ -4,6 +4,7 @@
> >   */
> >
> >  #include <linux/bitops.h>
> > +#include <linux/cpu.h>
> >  #include <linux/kernel.h>
> >  #include <linux/mm.h>
> >  #include <linux/prctl.h>
> > @@ -26,6 +27,9 @@ u64 gcr_kernel_excl __ro_after_init;
> >
> >  static bool report_fault_once = true;
> >
> > +DEFINE_PER_CPU_READ_MOSTLY(u64, mte_upgrade_async);
> > +EXPORT_PER_CPU_SYMBOL(mte_upgrade_async);
>
> I think this should be static and not exported, unless I missed its use
> elsewhere.

Will fix.

> > @@ -216,15 +210,33 @@ void mte_thread_init_user(void)
> >       dsb(ish);
> >       write_sysreg_s(0, SYS_TFSRE0_EL1);
> >       clear_thread_flag(TIF_MTE_ASYNC_FAULT);
> > -     /* disable tag checking */
> > -     set_task_sctlr_el1((current->thread.sctlr_user & ~SCTLR_EL1_TCF0_MASK) |
> > -                        SCTLR_EL1_TCF0_NONE);
> > -     /* reset tag generation mask */
> > -     set_gcr_el1_excl(SYS_GCR_EL1_EXCL_MASK);
> > +     /* disable tag checking and reset tag generation mask */
> > +     current->thread.mte_ctrl =
> > +             MTE_CTRL_GCR_USER_EXCL_MASK | MTE_CTRL_TCF_NONE;
> > +     mte_update_sctlr_user(current);
> > +     set_task_sctlr_el1(current->thread.sctlr_user);
> > +}
> > +
> > +void mte_update_sctlr_user(struct task_struct *task)
> > +{
> > +     unsigned long sctlr = task->thread.sctlr_user;
> > +
> > +     sctlr &= ~SCTLR_EL1_TCF0_MASK;
> > +     if ((task->thread.mte_ctrl & MTE_CTRL_DYNAMIC_TCF) &&
> > +         (task->thread.mte_ctrl & MTE_CTRL_TCF_MASK) == MTE_CTRL_TCF_ASYNC) {
> > +             sctlr |= __this_cpu_read(mte_upgrade_async);
>
> If we consider 0 to mean "disable upgrade", you'd just need another
> check here before the write. But it may simplify some of the sysfs code
> to avoid the switch statement and the pre-initialisation of
> mte_upgrade_async.

The way that I've written it will accommodate upgrading to asym more
easily. With that, either way we will need a switch statement
somewhere and it seems best to have it on the sysfs side since that
code will be invoked less often.

> > +     } else {
> > +             sctlr |= ((task->thread.mte_ctrl & MTE_CTRL_TCF_MASK) >>
> > +                       MTE_CTRL_TCF_SHIFT)
> > +                      << SCTLR_EL1_TCF0_SHIFT;
>
> Nitpick: we tend to place the operator on the previous line; you can
> probably place the shift constant on that line as well.

Will fix.

> > +     }
> > +     task->thread.sctlr_user = sctlr;
>
> I think on the "else" path, we shouldn't bother updating sctlr_user,
> though it probably needs some tweaking of the prctl() path.

I thought about that, but it didn't seem like it would justify the
added complexity on the prctl path.

Peter



More information about the linux-arm-kernel mailing list