[PATCH v5] arm64: mte: allow async MTE to be upgraded to sync on a per-CPU basis
Peter Collingbourne
pcc at google.com
Thu Jun 17 15:13:57 PDT 2021
On Thu, Jun 17, 2021 at 2:58 PM Will Deacon <will at kernel.org> wrote:
>
> Hi Peter,
>
> On Tue, Jun 15, 2021 at 01:38:07PM -0700, Peter Collingbourne wrote:
> > On some CPUs the performance of MTE in synchronous mode is similar
> > to that of asynchronous mode. This makes it worthwhile to enable
> > synchronous mode on those CPUs when asynchronous mode is requested,
> > in order to gain the error detection benefits of synchronous mode
> > without the performance downsides. Therefore, make it possible for
> > user programs to opt into upgrading to synchronous mode on those CPUs
> > via a new prctl flag. The flag is orthogonal to the existing TCF modes
> > in order to accommodate upgrading from other TCF modes in the future.
> >
> > The feature is controlled on a per-CPU basis via sysfs.
> >
> > Signed-off-by: Peter Collingbourne <pcc at google.com>
> > Link: https://linux-review.googlesource.com/id/Id6f95b71fde6e701dd30b5e108126af7286147e8
> > ---
> > v5:
> > - updated documentation
> > - address some nits in mte.c
> >
> > v4:
> > - switch to new mte_ctrl field
> > - make register_mte_upgrade_async_sysctl return an int
> > - change the sysctl to take 0 or 1 instead of raw TCF values
> > - "same as" -> "similar to"
> >
> > v3:
> > - drop the device tree support
> > - add documentation
> > - add static_assert to ensure no overlap with real HW bits
> > - move per-CPU variable initialization to mte.c
> > - use smp_call_function_single instead of stop_machine
> >
> > v2:
> > - make it an opt-in behavior
> > - change the format of the device tree node
> > - also allow controlling the feature via sysfs
> >
> > .../arm64/memory-tagging-extension.rst | 20 +++
> > arch/arm64/include/asm/mte.h | 4 +
> > arch/arm64/include/asm/processor.h | 14 +-
> > arch/arm64/kernel/asm-offsets.c | 2 +-
> > arch/arm64/kernel/entry.S | 4 +-
> > arch/arm64/kernel/mte.c | 151 ++++++++++++++----
> > arch/arm64/kernel/process.c | 2 +-
> > include/uapi/linux/prctl.h | 2 +
> > 8 files changed, 162 insertions(+), 37 deletions(-)
> >
> > diff --git a/Documentation/arm64/memory-tagging-extension.rst b/Documentation/arm64/memory-tagging-extension.rst
> > index b540178a93f8..5375d5efd18c 100644
> > --- a/Documentation/arm64/memory-tagging-extension.rst
> > +++ b/Documentation/arm64/memory-tagging-extension.rst
> > @@ -120,6 +120,25 @@ in the ``PR_MTE_TAG_MASK`` bit-field.
> > interface provides an include mask. An include mask of ``0`` (exclusion
> > mask ``0xffff``) results in the CPU always generating tag ``0``.
> >
> > +Upgrading to stricter tag checking modes
> > +----------------------------------------
> > +
> > +On some CPUs the performance of MTE in stricter tag checking modes
> > +is similar to that of less strict tag checking modes. This makes it
> > +worthwhile to enable stricter checks on those CPUs when a less strict
> > +checking mode is requested, in order to gain the error detection
> > +benefits of the stricter checks without the performance downsides. To
> > +opt into upgrading to a stricter checking mode on those CPUs, the user
> > +can set the ``PR_MTE_DYNAMIC_TCF`` flag bit in the ``flags`` argument
> > +to the ``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call.
> > +
> > +This feature is currently only supported for upgrading from
> > +asynchronous mode. To configure a CPU to upgrade from asynchronous mode
> > +to synchronous mode, a privileged user may write the value ``1`` to
> > +``/sys/devices/system/cpu/cpu<N>/mte_upgrade_async``, and to disable
> > +upgrading they may write the value ``0``. By default the feature is
> > +disabled on all CPUs.
> > +
> > Initial process state
> > ---------------------
> >
> > @@ -128,6 +147,7 @@ On ``execve()``, the new process has the following configuration:
> > - ``PR_TAGGED_ADDR_ENABLE`` set to 0 (disabled)
> > - Tag checking mode set to ``PR_MTE_TCF_NONE``
> > - ``PR_MTE_TAG_MASK`` set to 0 (all tags excluded)
> > +- ``PR_MTE_DYNAMIC_TCF`` set to 0 (disabled)
> > - ``PSTATE.TCO`` set to 0
> > - ``PROT_MTE`` not set on any of the initial memory maps
>
> Something about this doesn't sit right with me, as we're mixing a per-task
> interface with a per-cpu interface for selecting async/sync MTE and the
> priorities are somewhat confusing.
>
> I think a better interface would be for the sysfs entry for each CPU to
> allow selection between:
>
> task : Honour the prctl() (current behaviour)
> async : Force async for tasks using MTE
> sync : Force sync for tasks using MTE
> none : MTE disabled
>
> i.e. the per-cpu setting is an override.
>
> Would that give you what you need?
The approach in v1 of the patch [1] was that the per-CPU setting (at
that time a DT attribute) unconditionally overrode the TCF setting
provided by the user, so in that respect it's similar to what you
proposed, however Catalin and Vincenzo considered it to be an ABI
break, which I don't 100% agree with, but I think it's a fair enough
criticism.
I also don't think the setting you've mentioned provides enough
flexibility, especially when asymmetric support is added [2], and in
some cases can force a *downgrade* of the TCF setting to a weaker one,
which doesn't seem desirable. For example sync -> async weakens
security and async/sync -> none does the same as well as being more
clearly an ABI break.
Peter
[1] https://lore.kernel.org/linux-arm-kernel/20210602232445.3829248-1-pcc@google.com/
[2] https://lore.kernel.org/linux-arm-kernel/CAMn1gO7b0qMYMFz45eKdjNQV24V9YH9nqDgUpSbX5WJfkaJzCg@mail.gmail.com/
More information about the linux-arm-kernel
mailing list