[PATCH v5] arm64: mte: allow async MTE to be upgraded to sync on a per-CPU basis
Catalin Marinas
catalin.marinas at arm.com
Thu Jun 24 09:52:28 PDT 2021
On Wed, Jun 23, 2021 at 09:55:30AM +0100, Szabolcs Nagy wrote:
> The 06/22/2021 11:37, Peter Collingbourne wrote:
> > On Mon, Jun 21, 2021 at 11:50 AM Catalin Marinas
> > <catalin.marinas at arm.com> wrote:
> > > On Mon, Jun 21, 2021 at 06:39:02PM +0100, Will Deacon wrote:
> > > > On Mon, Jun 21, 2021 at 04:18:59PM +0100, Catalin Marinas wrote:
> > > > > Given that there are no real users of MTE yet, we have some choice of
> > > > > tweaking the ABI, backporting to 5.10. The question: is the expectation
> > > > > that the sysfs forcing of TCF is limited to deployments where the user
> > > > > space is tightly controlled (e.g. Android with most apps starting from
> > > > > zygote) or we allow it to become more of a general hint of what's the
> > > > > fastest check on a CPU? If the former, I'm fine with forcing without any
> > > > > additional bit, though I'd still prefer the opt-in. For the latter, I'd
> > > > > like some wider discussion with non-Android folk on what they expect
> > > > > from the TCF setting. Otherwise simply using PROT_MTE may lead to
> > > > > tag check faults.
> > > >
> > > > I don't think there's anything Android-specific here. The problem being
> > > > solved concerns big/little SoCs with MTE, and I think it's up to the
> > > > distribution how the sysfs stuff is used.
> > >
> > > But distros don't control what applications are running, so most likely
> > > they would disable the sysfs control entirely. At that point, the
> > > feature becomes primarily an Android play.
> > >
> > > Anyway, I'm not dead against forcing of the TCF mode regardless of the
> > > user choice but I'd like to ensure that we don't miss other use-cases or
> > > that we don't make the sysfs control an expert-only feature.
> > >
> > > Adding Szabolcs to get a view from the glibc perspective.
>
> Adding Tejas as he will look at memory tagging in glibc.
Thanks. Is there any MTE support in mainline glibc? If not, we may have
another chance of adjusting the ABI.
> > Given these diverging opinions my view is that we should choose
> > whichever option leaves our options open for the future. For example,
> > imagine that we make the ABI change now such that upgrades may happen
> > for all applications and we don't have PR_MTE_DYNAMIC_TCF. This means
> > that applications no longer have a guarantee of their TCF mode which
> > may preclude some use cases (if we add an opt out later, applications
> > will be affected when running on the kernel versions between when we
> > changed the ABI and when we added the opt out). On the other hand, if
> > we introduce PR_MTE_DYNAMIC_TCF, we can always make the ABI change
> > later and start ignoring the PR_MTE_DYNAMIC_TCF flag at that point.
> >
> > Maybe the best compromise would be to change the ABI and at the same
> > time add the opt out, but I don't have a strong opinion.
>
> so the observable difference between async mode and async mode
> upgraded to sync mode is that async mode allows ignoring the fault
> and things can continue, while in sync mode the program cannot move
> forward in case of a fault since the pc is still at the faulting
> instruction?
Yes. Would an application find the async mode useful, or can it be
freely overridden by the kernel (well, by a user via sysfs)?
> maybe we can have a mode where the cpu does sync mode checks but the
> kernel steps over the faulting instruction before reporting it? so
> emulating async semantics (in a slow and complicated way..), but
> guaranteeing (almost) immediate faults for better debugging/security.
It's a pretty complex step over (or emulate), so I wouldn't go there.
The user signal handler could disable tag checking altogether
(PSTATE.TCO) and continue.
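As a rough illustration of that last point, a minimal sketch of such a
handler follows. It assumes an arm64 Linux target, a handler installed
with SA_SIGINFO, and that the kernel restores the saved PSTATE
(including TCO) on sigreturn. The names pstate_set_tco() and
mte_segv_handler() are hypothetical, and the fallback constants are
only there in case the userspace headers do not provide them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stdint.h>
#include <ucontext.h>

#ifndef PSR_TCO_BIT
#define PSR_TCO_BIT 0x02000000UL	/* PSTATE.TCO, bit 25 */
#endif
#ifndef SEGV_MTESERR
#define SEGV_MTESERR 9			/* synchronous MTE tag check fault */
#endif

/* Return a saved PSTATE value with Tag Check Override set (hypothetical helper). */
static uint64_t pstate_set_tco(uint64_t pstate)
{
	return pstate | PSR_TCO_BIT;
}

#if defined(__aarch64__) && defined(__linux__)
/* SA_SIGINFO handler: on a sync tag check fault, resume with checks overridden. */
static void mte_segv_handler(int sig, siginfo_t *si, void *ctx)
{
	ucontext_t *uc = ctx;

	if (si->si_code != SEGV_MTESERR) {
		/* Not a sync tag check fault: restore the default action and refault. */
		signal(sig, SIG_DFL);
		return;
	}
	/* The faulting instruction is retried on return, now with TCO set. */
	uc->uc_mcontext.pstate = pstate_set_tco(uc->uc_mcontext.pstate);
}
#endif
```

Note that tag checks then stay disabled for that thread until it clears
PSTATE.TCO again, so this trades the sync guarantee for forward
progress.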
> personally i don't see a big issue with a mode that says "either sync
> or async check behaviour" and user code has to deal with it, however..
The question is more about whether we still want to keep the current
user program choice of sync vs async (vs the new asymmetric mode in
Armv8.7). If the user doesn't care, we can just override it from the
kernel without any additional PR_ flag for opting in (or out).
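For reference, the current per-thread choice is made through
prctl(PR_SET_TAGGED_ADDR_CTRL). A minimal sketch, assuming the existing
arm64 uapi constants: mte_ctrl_arg() and mte_request_mode() are
hypothetical helper names, and the 0xfffe include mask (all tags except
0) is just a common choice, not a requirement. The #ifndef fallbacks
only matter with older userspace headers.

```c
#include <sys/prctl.h>

/* Fallbacks for older userspace headers; values match the arm64 uapi. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif
#ifndef PR_MTE_TCF_SYNC
#define PR_MTE_TCF_SYNC (1UL << 1)
#endif
#ifndef PR_MTE_TCF_ASYNC
#define PR_MTE_TCF_ASYNC (1UL << 2)
#endif
#ifndef PR_MTE_TAG_SHIFT
#define PR_MTE_TAG_SHIFT 3
#endif

/* Build the PR_SET_TAGGED_ADDR_CTRL argument (hypothetical helper). */
static unsigned long mte_ctrl_arg(unsigned long tcf_mode,
				  unsigned long include_mask)
{
	return PR_TAGGED_ADDR_ENABLE | tcf_mode |
	       ((include_mask & 0xffffUL) << PR_MTE_TAG_SHIFT);
}

/*
 * Request a tag check fault mode for the calling thread.
 * Returns 0 on success, -1 with errno set (e.g. no MTE support).
 */
static int mte_request_mode(unsigned long tcf_mode)
{
	/* 0xfffe: let IRG generate tags 1..15, keeping tag 0 excluded. */
	return prctl(PR_SET_TAGGED_ADDR_CTRL,
		     mte_ctrl_arg(tcf_mode, 0xfffe), 0, 0, 0);
}
```

A thread calling mte_request_mode(PR_MTE_TCF_ASYNC) is the case under
discussion: with the proposed sysfs control, it could end up with sync
checks on some CPUs despite this request.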
> the per cpu setting is a bit nasty: can the kernel decide which cpu
> should use sync and which async? or a privileged user will have to
> fiddle with sysfs settings on every system to make this useful?
The proposed interface is sysfs. I think that's not relevant to the user
application since it wouldn't have control over it anyway. What's
visible is that it cannot rely on the mode it requested, not even for
the lifetime of a thread (as it may migrate between CPUs). Do you see
any issues with this? For Android, it's probably fine but if other
programs cannot cope (or need the specific mode requested), we'd need a
new control (for opt-in or opt-out).
--
Catalin