[PATCH v5] arm64: mte: allow async MTE to be upgraded to sync on a per-CPU basis

Will Deacon will at kernel.org
Wed Jul 7 03:30:04 PDT 2021


On Wed, Jun 30, 2021 at 04:39:02PM -0700, Peter Collingbourne wrote:
> On Wed, Jun 30, 2021 at 8:19 AM Catalin Marinas <catalin.marinas at arm.com> wrote:
> >
> > On Tue, Jun 29, 2021 at 12:11:17PM -0700, Peter Collingbourne wrote:
> > > On Tue, Jun 29, 2021 at 3:46 AM Will Deacon <will at kernel.org> wrote:
> > > > On Mon, Jun 28, 2021 at 04:20:24PM +0100, Catalin Marinas wrote:
> > > > > Another option is a mapping table where async can be remapped to sync
> > > > > and sync to async (or even to "none" for both). That's not far from one
> > > > > of Peter's mte-upgrade-async proposal, we just add mte-map-async and
> > > > > mte-map-sync options. Most likely we'll just use mte-map-async for now
> > > > > to map it to sync on some CPUs but it wouldn't exclude other forced
> > > > > settings.
> > > >
> > > > Catalin and I discussed this offline and ended up with another option:
> > > > retrospectively change the prctl() ABI so that the 'flags' argument
> > > > accepts a bitmask of modes that the application is willing to accept. This
> > > > doesn't break any existing users, as we currently enforce that only one
> > > > mode is specified, but it would allow things like:
> > > >
> > > >         prctl(PR_SET_TAGGED_ADDR_CTRL,
> > > >               PR_MTE_TCF_SYNC | PR_MTE_TCF_ASYNC,
> > > >               0, 0, 0);
> > > >
> > > > which is actually very similar to Peter's PR_MTE_DYNAMIC_TCF proposal, with
> > > > the difference that I think this extends more naturally as new PR_MTR_TCF_*
> > > > flags are introduced.
> > > >
> > > > Then we expose a per-cpu file in sysfs (say "cpuX/mte_tcf_preferred")
> > > > which initially reads as "async". If the root user does, e.g.
> > > >
> > > >         # echo "sync" > cpu1/mte_tcf_preferred
> > > >
> > > > then a task which has successfully issued a PR_SET_TAGGED_ADDR_CTRL prctl()
> > > > request for PR_MTE_TCF_SYNC | PR_MTE_TCF_ASYNC will run in sync mode on
> > > > CPU1, but async mode on other CPUs (assuming they retain the default value).
> > > >
> > > > We'll need to special-case PR_MTE_TCF_NONE, as that's just a shorthand for
> > > > "no flags" so doing PR_MTE_TCF_NONE | PR_MTE_TCF_SYNC is just the same as
> > > > doing PR_MTE_TCF_SYNC (which I think is already the behaviour today). The
> > > > only values which the sysfs files would accept today are "sync" and "async".
> > > >
> > > > When faced with a situation where the prctl() flags for a task do not
> > > > intersect with the preferred mode for a CPU on which the task is going
> > > > to run, the lowest bit number flag is chosen from the mask set by the
> > > > prctl().
> > > >
> > > > Thoughts?
> > >
> > > This all sounds great and I'm glad you were able to come to an
> > > agreement on this. I'll get started on implementing it.
> > >
> > > Once we have ASYM support I'm not sure if we can rely on bit numbering
> > > for ordering, as we will want the ordering to be ASYNC < ASYM < SYNC
> > > and ASYNC is bit-adjacent to SYNC. So I think we will need to make
> > > ASYM a special case.
> >
> > The bit position based order - SYNC < ASYNC < ASYM - is indeed arbitrary
> > but I think it's easier to follow. When we add ASYM, it will be the last
> > one rather than squeezing it in the middle of the current order. Any
> > other order somehow implies that one is better than the other but we
> > don't have a clear definition for "better".
> 
> At least from my perspective "more strict" is a reasonable enough
> definition for "better", at least by default, since it seems like what
> most users would want. If users really want a different ordering,
> perhaps it can be an opt-in behavior.

I think some of this hinges on how we want to handle something like a
request for PR_MTE_TCF_ASYM | PR_MTE_TCF_ASYNC on a system which doesn't
have hardware support for PR_MTE_TCF_ASYM. If this returns an error from
the prctl(), then I'm not strongly opposed to having a "more strict"
priority ordering for the options, but if this was to succeed, then I think
we need something like the bit-position ordering so that the user-visible
behaviour doesn't change in a non-discoverable manner based on what the CPU
supports.

But yes, for now this is all moot with only two options.

Will



More information about the linux-arm-kernel mailing list