[PATCH v7] arm64: mte: allow async MTE to be upgraded to sync on a per-CPU basis

Catalin Marinas catalin.marinas at arm.com
Wed Jun 30 10:34:09 PDT 2021


On Tue, Jun 29, 2021 at 04:57:04PM -0700, Peter Collingbourne wrote:
> diff --git a/Documentation/arm64/memory-tagging-extension.rst b/Documentation/arm64/memory-tagging-extension.rst
> index b540178a93f8..39c04541f57c 100644
> --- a/Documentation/arm64/memory-tagging-extension.rst
> +++ b/Documentation/arm64/memory-tagging-extension.rst
> @@ -77,14 +77,18 @@ configurable behaviours:
>    address is unknown).
>  
>  The user can select the above modes, per thread, using the
> -``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
> -``flags`` contain one of the following values in the ``PR_MTE_TCF_MASK``
> +``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where ``flags``
> +contains any number of the following values in the ``PR_MTE_TCF_MASK``
>  bit-field:
>  
> -- ``PR_MTE_TCF_NONE``  - *Ignore* tag check faults

I'd keep this and maybe add a comment that it will be ignored if
combined with any other options. IOW, it's already user API and I'd keep
it together with all the definition in prctl.h (you haven't removed it).

>  - ``PR_MTE_TCF_SYNC``  - *Synchronous* tag check fault mode
>  - ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode
>  
> +If no modes are specified, tag check faults are ignored. If a single
> +mode is specified, the program will run in that mode. If multiple
> +modes are specified, the mode is selected as described in the "Per-CPU
> +preferred tag checking modes" section below.
> +
>  The current tag check fault mode can be read using the
>  ``prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)`` system call.
>  
> @@ -120,13 +124,37 @@ in the ``PR_MTE_TAG_MASK`` bit-field.
>  interface provides an include mask. An include mask of ``0`` (exclusion
>  mask ``0xffff``) results in the CPU always generating tag ``0``.
>  
> +Per-CPU preferred tag checking mode
> +-----------------------------------
> +
> +On some CPUs the performance of MTE in stricter tag checking modes
> +is similar to that of less strict tag checking modes. This makes it
> +worthwhile to enable stricter checks on those CPUs when a less strict
> +checking mode is requested, in order to gain the error detection
> +benefits of the stricter checks without the performance downsides. To
> +support this scenario, a privileged user may configure a stricter
> +tag checking mode as the CPU's preferred tag checking mode.
> +
> +The preferred tag checking mode for each CPU is controlled by
> +``/sys/devices/system/cpu/cpu<N>/mte_tcf_preferred``, to which a
> +privileged user may write the value ``async`` or ``sync``.  The default
> +preferred mode for each CPU is ``async``.
> +
> +To allow a program to potentially run in the CPU's preferred tag
> +checking mode, the user program may set multiple tag check fault mode
> +bits in the ``flags`` argument to the ``prctl(PR_SET_TAGGED_ADDR_CTRL,
> +flags, 0, 0, 0)`` system call. If the CPU's preferred tag checking mode
> +is in the task's set of provided tag checking modes, that mode will
> +be selected. Otherwise, the least strict mode in the set is selected,
> +where ``async`` is considered less strict than ``sync``.

Specifying async vs sync order at this stage probably doesn't bring
much, though I don't mind as we'll add asym at some point. We already
have the default "async" in the mte_tcf_preferred, so async will be
selected if an app selects both. Also the intersection of task mode and
mte_tcf_preferred is only void if a single mode is selected by the user,
in which that case that mode would have to be honoured.

Where it gets more interesting is when we add the asym mode and current
applications only set sync|async. I expect CPUs that implement sync
sufficiently fast will continue with this preferred mode. The others may
likely want asym as a middle ground. With the bit-based order, sync will
be selected.

We might as well leave this debate until we have asym support since the
order doesn't matter with only two modes.

> +
>  Initial process state
>  ---------------------
>  
>  On ``execve()``, the new process has the following configuration:
>  
>  - ``PR_TAGGED_ADDR_ENABLE`` set to 0 (disabled)
> -- Tag checking mode set to ``PR_MTE_TCF_NONE``
> +- No tag checking modes are selected (tag check faults ignored)
>  - ``PR_MTE_TAG_MASK`` set to 0 (all tags excluded)
>  - ``PSTATE.TCO`` set to 0
>  - ``PROT_MTE`` not set on any of the initial memory maps

I think we should also update the example to show the or'ing of the
modes into the prctl() flags.

> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 9df3feeee890..6ad7b6c188f4 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -16,6 +16,13 @@
>   */
>  #define NET_IP_ALIGN	0
>  
> +#define MTE_CTRL_GCR_USER_EXCL_SHIFT	0
> +#define MTE_CTRL_GCR_USER_EXCL_MASK	0xffff
> +
> +#define MTE_CTRL_TCF_NONE		(1UL << 16)

Do we need this bit for none? I think we can get rid of it and simplify
the if/else blocks slightly.

> +#define MTE_CTRL_TCF_SYNC		(1UL << 17)
> +#define MTE_CTRL_TCF_ASYNC		(1UL << 18)
> +
>  #ifndef __ASSEMBLY__
>  
>  #include <linux/build_bug.h>
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index 125a10e413e9..c85eb9bd5a37 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -216,15 +209,34 @@ void mte_thread_init_user(void)
>  	dsb(ish);
>  	write_sysreg_s(0, SYS_TFSRE0_EL1);
>  	clear_thread_flag(TIF_MTE_ASYNC_FAULT);
> -	/* disable tag checking */
> -	set_task_sctlr_el1((current->thread.sctlr_user & ~SCTLR_EL1_TCF0_MASK) |
> -			   SCTLR_EL1_TCF0_NONE);
> -	/* reset tag generation mask */
> -	set_gcr_el1_excl(SYS_GCR_EL1_EXCL_MASK);
> +	/* disable tag checking and reset tag generation mask */
> +	current->thread.mte_ctrl =
> +		MTE_CTRL_GCR_USER_EXCL_MASK | MTE_CTRL_TCF_NONE;

We can drop MTE_CTRL_TCF_NONE here. If no async or sync mode is implied,
we assume no tag checking without an explicit 'none'.

> +	mte_update_sctlr_user(current);
> +	set_task_sctlr_el1(current->thread.sctlr_user);
> +}
> +
> +void mte_update_sctlr_user(struct task_struct *task)
> +{
> +	unsigned long sctlr = task->thread.sctlr_user;
> +	unsigned long pref = __this_cpu_read(mte_tcf_preferred);

We need to ensure this is called in a non-preemptible context. I don't
think the prctl() can guarantee that, so we get some weird behaviour if
it gets context switched mid-way through this. The callback from
mte_thread_switch() is fine and I think smp_call_function_single() does
a get_cpu() already.

I'd also add a function comment that it can only be called on the
current or next task otherwise since the CPU must match where the thread
is going to run.

Maybe easier to just pass the CPU number in addition to task. In most
cases it is smp_processor_id() while for prctl() we'd do a
get_cpu/put_cpu(). This would be cleaner unless this_cpu_read() is
(marginally) faster than per_cpu().

> +	unsigned long mte_ctrl = task->thread.mte_ctrl;
> +	unsigned long resolved_mte_tcf = (mte_ctrl & pref) ? pref : mte_ctrl;
> +
> +	sctlr &= ~SCTLR_EL1_TCF0_MASK;
> +	if (resolved_mte_tcf & MTE_CTRL_TCF_NONE)
> +		sctlr |= SCTLR_EL1_TCF0_NONE;

Same comment on dropping MTE_CTRL_TCF_NONE.

> +	else if (resolved_mte_tcf & MTE_CTRL_TCF_ASYNC)
> +		sctlr |= SCTLR_EL1_TCF0_ASYNC;
> +	else if (resolved_mte_tcf & MTE_CTRL_TCF_SYNC)
> +		sctlr |= SCTLR_EL1_TCF0_SYNC;
> +	task->thread.sctlr_user = sctlr;
>  }
>  
>  void mte_thread_switch(struct task_struct *next)
>  {
> +	mte_update_sctlr_user(next);
> +
>  	/*
>  	 * Check if an async tag exception occurred at EL1.
>  	 *
> @@ -262,33 +274,24 @@ void mte_suspend_exit(void)
>  
>  long set_mte_ctrl(struct task_struct *task, unsigned long arg)
>  {
> -	u64 sctlr = task->thread.sctlr_user & ~SCTLR_EL1_TCF0_MASK;
> -	u64 gcr_excl = ~((arg & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT) &
> -		       SYS_GCR_EL1_EXCL_MASK;
> +	u64 mte_ctrl = (~((arg & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT) &
> +			SYS_GCR_EL1_EXCL_MASK)
> +		       << MTE_CTRL_GCR_USER_EXCL_SHIFT;

Nit: just move the << MTE_CTRL_GCR_USER_EXCL_SHIFT to the previous line
(it gets to about 80 characters, I don't have a strict limit).

>  	if (!system_supports_mte())
>  		return 0;
>  
> -	switch (arg & PR_MTE_TCF_MASK) {
> -	case PR_MTE_TCF_NONE:
> -		sctlr |= SCTLR_EL1_TCF0_NONE;
> -		break;
> -	case PR_MTE_TCF_SYNC:
> -		sctlr |= SCTLR_EL1_TCF0_SYNC;
> -		break;
> -	case PR_MTE_TCF_ASYNC:
> -		sctlr |= SCTLR_EL1_TCF0_ASYNC;
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> -
> -	if (task != current) {
> -		task->thread.sctlr_user = sctlr;
> -		task->thread.gcr_user_excl = gcr_excl;
> -	} else {
> -		set_task_sctlr_el1(sctlr);
> -		set_gcr_el1_excl(gcr_excl);
> +	if ((arg & PR_MTE_TCF_MASK) == PR_MTE_TCF_NONE)
> +		mte_ctrl |= MTE_CTRL_TCF_NONE;
> +	if (arg & PR_MTE_TCF_ASYNC)
> +		mte_ctrl |= MTE_CTRL_TCF_ASYNC;
> +	if (arg & PR_MTE_TCF_SYNC)
> +		mte_ctrl |= MTE_CTRL_TCF_SYNC;
> +
> +	task->thread.mte_ctrl = mte_ctrl;
> +	if (task == current) {
> +		mte_update_sctlr_user(task);
> +		set_task_sctlr_el1(task->thread.sctlr_user);

Here I don't think we guarantee the preemption being disabled.

> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 18a9f59dc067..48de354b847f 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -234,11 +234,10 @@ struct prctl_mm_map {
>  #define PR_GET_TAGGED_ADDR_CTRL		56
>  # define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
>  /* MTE tag check fault modes */
> -# define PR_MTE_TCF_SHIFT		1
> -# define PR_MTE_TCF_NONE		(0UL << PR_MTE_TCF_SHIFT)
> -# define PR_MTE_TCF_SYNC		(1UL << PR_MTE_TCF_SHIFT)
> -# define PR_MTE_TCF_ASYNC		(2UL << PR_MTE_TCF_SHIFT)
> -# define PR_MTE_TCF_MASK		(3UL << PR_MTE_TCF_SHIFT)
> +# define PR_MTE_TCF_NONE		0
> +# define PR_MTE_TCF_SYNC		(1UL << 1)
> +# define PR_MTE_TCF_ASYNC		(1UL << 2)
> +# define PR_MTE_TCF_MASK		(PR_MTE_TCF_SYNC | PR_MTE_TCF_ASYNC)

I'm fine with redefining these but removing PR_MTE_TCF_SHIFT may break
some current user-space compilation. It does look weird to keep it
around though.

Anyway, I think the patch looks alright overall but I'd split it in 3-4
small patches for easier reviewing. Something like:

0. Cover letter ;)

1. Rename gcr_user_excl to mte_ctrl

2. Use individual bits for TCF sync/async and allow combinations. This
   would update the uapi prctl.h. On its own, this patch will have to
   choose a preferred mode since we don't have the sysfs interface yet.

3. Introduce the sysfs interface and the preferred mode selection

4. Update documentation

Thanks.

-- 
Catalin



More information about the linux-arm-kernel mailing list