[PATCH v1 4/4] arm64/mte: Add userspace interface for enabling asymmetric mode

Peter Collingbourne pcc at google.com
Mon Mar 7 12:55:28 PST 2022


On Mon, Mar 7, 2022 at 7:36 AM Catalin Marinas <catalin.marinas at arm.com> wrote:
>
> On Fri, Mar 04, 2022 at 01:09:00PM -0800, Peter Collingbourne wrote:
> > On Thu, Mar 3, 2022 at 2:41 AM Catalin Marinas <catalin.marinas at arm.com> wrote:
> > > On Wed, Mar 02, 2022 at 12:58:48PM -0800, Evgenii Stepanov wrote:
> > > > On Wed, Mar 2, 2022 at 11:33 AM Mark Brown <broonie at kernel.org> wrote:
> > > > > On Wed, Mar 02, 2022 at 10:44:31AM -0800, Evgenii Stepanov wrote:
> > > > > > On Wed, Mar 2, 2022 at 5:10 AM Mark Brown <broonie at kernel.org> wrote:
> > > > > > > On Wed, Mar 02, 2022 at 11:44:53AM +0000, Catalin Marinas wrote:
> > > > > > > > On Tue, Mar 01, 2022 at 04:52:01PM -0800, Evgenii Stepanov wrote:
> > > > >
> > > > > > > > > Extending PR_MTE_TCF_MASK seems bad for backward compatibility. User
> > > > > > > > > > code may do "flags &= ~PR_MTE_TCF_MASK" to disable MTE; when compiled
> > > > > > > > > against an old version of the header this would fail to remove the ASYMM
> > > > > > > > > bit.
> > > > >
> > > > > > > > But if the app is compiled against an old version, it wouldn't set
> > > > > > > > MTE_CTRL_TCF_ASYMM either as it doesn't have the definition.
> > > > >
> > > > > > Libraries within a single process can be built against different
> > > > > > header versions. In our case, this is libc vs the app: we expect to
> > > > > > set all 3 mode bits when an app asks for "async" to enable the
> > > > > > mte_tcf_preferred logic. Even if the app is built against an older NDK
> > > > > > > and unaware of the Asymm mode's existence!
> > > > >
> > > > > I can't see how we can resolve that case in the kernel except by adding
> > > > > a specific call to disable all MTE modes which would obviously only be
> > > > > useful for future proofing given that no existing applications would
> > > > > support it.
> > > >
> > > > One option would be to introduce a new, future-proofed prctl with a
> > > > wider mask, and throw in a few extra reserved bits just in case. Then
> > > > have the legacy prctl always clear MTE_CTRL_TCF_ASYMM.
> > >
> > > If this problem is real, we can easily tweak the current proposal so
> > > that ASYMM can only be set *if* both ASYNC and SYNC are set. IOW, it's
> > > not a specific mode an app requests on its own but rather something the
> > > kernel may choose via the preferred mode if the app opted in. We still
> > > introduce a new bit for ASYMM but not change the mask.
> >
> > As discussed out-of-band, I've never really liked this API style of
> > trying to cram everything into a single prctl(), not least because of
> > these compatibility concerns.
>
> I'd say the main problem is not necessarily the single prctl() which, at
> least initially, followed the hw control closely. The two main issues I
> see are: (1) unclear user-space ownership of such control and (2) the
> ABI change from single mode request to a mask of supported application
> modes which we did not foresee the need for when first adding MTE, nor
> did we see the potential problems with the asymmetric mode later.
>
> In Evgenii's use-case, it seems that it's not the libc owning the MTE
> controls but any other piece of code that may not be recompiled to the
> latest libc headers. We could argue that it's a user problem to solve
> and not that different from apps that use a non-zero tag or some pointer
> arithmetic and get confused by an MTE-aware allocator.

Yeah, one option we're considering is "do nothing" if none of the
possible solutions seem palatable, as so far this is only a
hypothetical problem and we could declare direct prctl() manipulation
from applications to be unsupported.
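
To make the hypothetical failure concrete, here is a minimal sketch of
what code built against the old headers would do. The SYNC and ASYNC
values mirror PR_MTE_TCF_SYNC and PR_MTE_TCF_ASYNC in <linux/prctl.h>;
the position of the ASYMM bit and the helper names are assumptions made
up for illustration, not taken from the patch.

```c
/* Values mirror PR_MTE_TCF_SYNC / PR_MTE_TCF_ASYNC in <linux/prctl.h>. */
#define TCF_SYNC      (1UL << 1)
#define TCF_ASYNC     (1UL << 2)
/* The mask as seen by code compiled against the old header. */
#define OLD_TCF_MASK  (TCF_SYNC | TCF_ASYNC)
/* Hypothetical bit position for ASYMM; an assumption, not from the patch. */
#define TCF_ASYMM     (1UL << 19)

/* What an old binary does when it wants to "disable MTE". */
unsigned long old_app_disable_tcf(unsigned long ctrl)
{
    return ctrl & ~OLD_TCF_MASK;
}

/* Nonzero if any tag check fault mode is still requested. */
int any_tcf_mode_left(unsigned long ctrl)
{
    return (ctrl & (TCF_SYNC | TCF_ASYNC | TCF_ASYMM)) != 0;
}
```

If libc had set all three bits, old_app_disable_tcf() leaves ASYMM
behind, so any_tcf_mode_left() still reports a mode as enabled.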

> Can any libc versioning help with making sure old apps are not linked
> against newer libc with such change? I guess this only works for symbols
> but here it's a macro.

Native binaries on Android have an ELF note indicating the version of
the headers that they were compiled against. So one option on Android
would be to use these ELF notes for controlling on a per-app basis
whether to set ASYMM.
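
As a rough sketch of that per-app gating, assuming the header version
has already been read out of the ELF note: the function name, the
version parameters and the ASYMM bit position below are all
hypothetical, and the note parsing itself is left out.

```c
/* Values mirror PR_MTE_TCF_SYNC / PR_MTE_TCF_ASYNC in <linux/prctl.h>. */
#define TCF_SYNC   (1UL << 1)
#define TCF_ASYNC  (1UL << 2)
/* Hypothetical bit position for ASYMM; an assumption, not from the patch. */
#define TCF_ASYMM  (1UL << 19)

/*
 * Hypothetical helper: strip ASYMM from the requested bits when the app
 * was built against headers older than the first ASYMM-aware version.
 * Both version numbers are supplied by the caller; none is assumed here.
 */
unsigned long tcf_bits_for_app(unsigned long requested,
                               int app_header_version,
                               int first_asymm_aware_version)
{
    if (app_header_version < first_asymm_aware_version)
        return requested & ~TCF_ASYMM;
    return requested;
}
```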

> > Evgenii proposed a prctl() with a wider
> > mask, but I think this sort of approach just kicks the can down the
> > road (well, maybe for *this* feature we can't expect there to be too
> > many new modes added, but can we say the same for everything else in
> > tagged_addr_ctrl?).
>
> Extending the mask and adding a few more bits doesn't solve the original
> problem raised by Evgenii: applications compiled against current headers
> but (dynamically) linked against future libc could be confused if they
> manage the prctl() themselves.
>
> > And an attempt to guess the user intent from which
> > bits are set seems prone to failure and unnecessarily restrictive.
>
> Only allowing asymmetric mode if both sync and async are supported
> is not that confusing. The only downside is that one cannot ask for
> asymmetric mode only but is this such a big problem? For testing one can
> always set the sysfs preferred mode to asymm. It will be some time
> before we see asymm in production.

It could be a problem on Android because one of the modes that we
intend to expose to application developers is "asymm-only" (as opposed
to "upgradable asymm"). This would be used by developers who need
finer-grained control over the MTE mode used in their application.
Although to be honest I'm not 100% certain that we would need this
mode so perhaps we could live with just the "upgradable asymm" mode.
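
For illustration, the restriction proposed above (ASYMM accepted only
when both SYNC and ASYNC are also set) could be modelled as below. This
is only a userspace sketch of the rule, not kernel code; the function
name and the ASYMM bit position are assumptions.

```c
/* Values mirror PR_MTE_TCF_SYNC / PR_MTE_TCF_ASYNC in <linux/prctl.h>. */
#define TCF_SYNC   (1UL << 1)
#define TCF_ASYNC  (1UL << 2)
/* Hypothetical bit position for ASYMM; an assumption, not from the patch. */
#define TCF_ASYMM  (1UL << 19)

/* Returns 1 if the requested mode bits satisfy the proposed rule. */
int tcf_request_valid(unsigned long ctrl)
{
    const unsigned long both = TCF_SYNC | TCF_ASYNC;

    /* ASYMM is only allowed as an add-on to SYNC and ASYNC together. */
    if ((ctrl & TCF_ASYMM) && (ctrl & both) != both)
        return 0;
    return 1;
}
```

Under this rule an "asymm-only" request, tcf_request_valid(TCF_ASYMM),
is rejected, which is exactly the case discussed above.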

> > It
> > seems far more preferable to have a separate prctl() control for the
> > TCF bits and start working towards splitting out the other bits to
> > their own prctl()s. Then any user code that manipulates these bits
> > will "naturally" work. For compatibility the TCF bits would still be
> > accessible via the existing prctl(), but an attempt to set TCF via
> > this prctl() would clear the ASYMM bit.
>
> In hindsight, this would have been better. Right now we have to make
> sure we don't break the ABI and it's not only about setting the TCF but
> also getting the controls (prctl(PR_GET_TAGGED_ADDR_CTRL), ptrace()).
>
> Assuming we have a different prctl() for setting TCF, what would we
> report in the current PR_GET_ for existing apps when only the asymm mode
> was set? By the same logic as the TCF mask breaking current apps, this
> doesn't work if somehow an app uses the information and, for example,
> retries a faulting operation indefinitely.

One option may be to reject the GET with EINVAL if ASYMM is set. This
should hopefully result in the application acting as if MTE is not
enabled/supported. Then ASYMM-aware applications can do something like
pass arg2=1 to get the other bits regardless. We wouldn't do anything
like this for ptrace() (i.e. it would always return the exposed bits
of tagged_addr_ctrl regardless of ASYMM setting), but ptrace() seems
less of a concern since it's only used by debuggers and such.
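
A minimal model of that GET behaviour, to show the intended semantics
rather than an implementation: the function name and the ASYMM bit
position are hypothetical, and returning the full control word when
arg2=1 is one reading of "get the other bits regardless". The return
value follows the kernel-side convention (negative errno on failure).

```c
#include <errno.h>

/* Value mirrors PR_MTE_TCF_SYNC in <linux/prctl.h>. */
#define TCF_SYNC   (1UL << 1)
/* Hypothetical bit position for ASYMM; an assumption, not from the patch. */
#define TCF_ASYMM  (1UL << 19)

/*
 * Hypothetical model of the GET path: legacy callers (arg2 == 0) get
 * -EINVAL whenever ASYMM is set; ASYMM-aware callers pass arg2 == 1.
 */
long model_get_tagged_addr_ctrl(unsigned long ctrl, unsigned long arg2)
{
    if ((ctrl & TCF_ASYMM) && arg2 != 1)
        return -EINVAL;
    /* Assumption: aware callers see the whole word, ASYMM included. */
    return (long)ctrl;
}
```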

Peter


