[PATCH v5] arm64: mte: allow async MTE to be upgraded to sync on a per-CPU basis

Will Deacon will at kernel.org
Wed Jul 7 07:11:15 PDT 2021


On Wed, Jul 07, 2021 at 01:55:45PM +0100, Catalin Marinas wrote:
> On Wed, Jul 07, 2021 at 11:30:04AM +0100, Will Deacon wrote:
> > On Wed, Jun 30, 2021 at 04:39:02PM -0700, Peter Collingbourne wrote:
> > > On Wed, Jun 30, 2021 at 8:19 AM Catalin Marinas <catalin.marinas at arm.com> wrote:
> > > > On Tue, Jun 29, 2021 at 12:11:17PM -0700, Peter Collingbourne wrote:
> > > > > On Tue, Jun 29, 2021 at 3:46 AM Will Deacon <will at kernel.org> wrote:
> > > > > > On Mon, Jun 28, 2021 at 04:20:24PM +0100, Catalin Marinas wrote:
> > > > > > > Another option is a mapping table where async can be remapped to sync
> > > > > > > and sync to async (or even to "none" for both). That's not far from one
> > > > > > > of Peter's mte-upgrade-async proposal, we just add mte-map-async and
> > > > > > > mte-map-sync options. Most likely we'll just use mte-map-async for now
> > > > > > > to map it to sync on some CPUs but it wouldn't exclude other forced
> > > > > > > settings.
> > > > > >
> > > > > > Catalin and I discussed this offline and ended up with another option:
> > > > > > retrospectively change the prctl() ABI so that the 'flags' argument
> > > > > > accepts a bitmask of modes that the application is willing to accept. This
> > > > > > doesn't break any existing users, as we currently enforce that only one
> > > > > > mode is specified, but it would allow things like:
> > > > > >
> > > > > >         prctl(PR_SET_TAGGED_ADDR_CTRL,
> > > > > >               PR_MTE_TCF_SYNC | PR_MTE_TCF_ASYNC,
> > > > > >               0, 0, 0);
> > > > > >
> > > > > > which is actually very similar to Peter's PR_MTE_DYNAMIC_TCF proposal, with
> > > > > > the difference that I think this extends more naturally as new PR_MTR_TCF_*
> > > > > > flags are introduced.
> > > > > >
> > > > > > Then we expose a per-cpu file in sysfs (say "cpuX/mte_tcf_preferred")
> > > > > > which initially reads as "async". If the root user does, e.g.
> > > > > >
> > > > > >         # echo "sync" > cpu1/mte_tcf_preferred
> > > > > >
> > > > > > then a task which has successfully issued a PR_SET_TAGGED_ADDR_CTRL prctl()
> > > > > > request for PR_MTE_TCF_SYNC | PR_MTE_TCF_ASYNC will run in sync mode on
> > > > > > CPU1, but async mode on other CPUs (assuming they retain the default value).
> > > > > >
> > > > > > We'll need to special-case PR_MTE_TCF_NONE, as that's just a shorthand for
> > > > > > "no flags" so doing PR_MTE_TCF_NONE | PR_MTE_TCF_SYNC is just the same as
> > > > > > doing PR_MTE_TCF_SYNC (which I think is already the behaviour today). The
> > > > > > only values which the sysfs files would accept today are "sync" and "async".
> > > > > >
> > > > > > When faced with a situation where the prctl() flags for a task do not
> > > > > > intersect with the preferred mode for a CPU on which the task is going
> > > > > > to run, the lowest bit number flag is chosen from the mask set by the
> > > > > > prctl().
> > > > > >
> > > > > > Thoughts?
> > > > >
> > > > > This all sounds great and I'm glad you were able to come to an
> > > > > agreement on this. I'll get started on implementing it.
> > > > >
> > > > > Once we have ASYM support I'm not sure if we can rely on bit numbering
> > > > > for ordering, as we will want the ordering to be ASYNC < ASYM < SYNC
> > > > > and ASYNC is bit-adjacent to SYNC. So I think we will need to make
> > > > > ASYM a special case.
> > > >
> > > > The bit position based order - SYNC < ASYNC < ASYM - is indeed arbitrary
> > > > but I think it's easier to follow. When we add ASYM, it will be the last
> > > > one rather than squeezing it in the middle of the current order. Any
> > > > other order somehow implies that one is better than the other but we
> > > > don't have a clear definition for "better".
> > > 
> > > At least from my perspective "more strict" is a reasonable enough
> > > definition for "better", at least by default, since it seems like what
> > > most users would want. If users really want a different ordering,
> > > perhaps it can be an opt-in behavior.
> > 
> > I think some of this hinges on how we want to handle something like a
> > request for PR_MTE_TCF_ASYM | PR_MTE_TCF_ASYNC on a system which doesn't
> > have hardware support for PR_MTE_TCF_ASYM. If this returns an error from
> > the prctl(), then I'm not strongly opposed to having a "more strict"
> > priority ordering for the options, but if this was to succeed, then I think
> > we need something like the bit-position ordering so that the user-visible
> > behaviour doesn't change in a non-discoverable manner based on what the CPU
> > supports.
> 
> I replied on a subsequent update to this series and I think leaving it
> as imp def is best ;) (as with other ARM architecture behaviours).
> 
> https://lore.kernel.org/r/20210701170450.GC12484@arm.com

I agree for now as there are only two flags, but I think we'll need to
define the behaviour once we add a third flag because it's a bit rubbish
from a userspace perspective otherwise.

Will



More information about the linux-arm-kernel mailing list