[PATCH v7 29/29] arm64: mte: Add Memory Tagging Extension documentation

Wed Aug 12 08:45:21 EDT 2020

The 08/11/2020 18:20, Catalin Marinas wrote:
> On Mon, Aug 10, 2020 at 03:13:09PM +0100, Szabolcs Nagy wrote:
> > The 08/07/2020 16:19, Catalin Marinas wrote:
> > > On Mon, Aug 03, 2020 at 01:43:10PM +0100, Szabolcs Nagy wrote:
> > > > if we can always turn sync tag checks on early whenever mte is
> > > > available then i think there is no issue.
> > > > 
> > > > but if we have to make the decision later for compatibility or
> > > > performance reasons then per thread setting is problematic.
> > > 
> > > At least for libc, I'm not sure how you could even turn MTE on at
> > > run-time. The heap allocations would have to be mapped with PROT_MTE as
> > > we can't easily change them (well, you could mprotect(), assuming the
> > > user doesn't use tagged pointers on them).
> > 
> > e.g. dlopen of library with stack tagging. (libc can mark stacks with
> > PROT_MTE at that time)
> 
> If we allow such mixed object support with stack tagging enabled at
> dlopen, PROT_MTE would need to be turned on for each thread stack. This
> wouldn't require synchronisation, only knowing where the thread stacks
> are, but you'd need to make sure threads don't call into the new library
> until the stacks have been mprotect'ed. Doing this midway through a
> function execution may corrupt the tags.
> 
> So I'm not sure how safe any of this is without explicit user
> synchronisation (i.e. don't call into the library until all threads have
> been updated). Even changing options like GCR_EL1.Excl across multiple
> threads may have unwanted effects. See this comment from Peter, the
> difference being that instead of an explicit prctl() call on the current
> stack, another thread would do it:
> 
> https://lore.kernel.org/linux-arch/CAMn1gO5rhOG1W+nVe103v=smvARcFFp_Ct9XqH2Ca4BUMfpDdg@mail.gmail.com/

there is no midway problem: the libc (ld.so) would do
the PROT_MTE at dlopen time based on some elf marking
(which can be handled before relocation processing,
so before library code can run). the midway problem
happens when a library, e.g. libc, wants to turn on
stack tagging on itself.

the libc already does this when a library is loaded
that requires executable stack: it marks stacks as
PROT_EXEC at dlopen time or fails the dlopen if that
is not possible. this does not require running code
in other threads, only synchronization with thread
creation and exit. but changing the check mode for
mte needs per thread code execution.

i'm not entirely sure if this is a good idea, but i
expect stack tagging not to be used in the libc
(because libc needs to run on all hw and we don't yet
have a backward compatible stack tagging solution),
so stack tagging should work when only some elf modules
in a process are built with it. this implies that
enabling it at dlopen time should work, otherwise
it will not be very useful.

> > or just turn on sync tag checks later when using heap tagging.
> 
> I wonder whether setting the synchronous tag check mode by default would
> improve this aspect. This would not have any effect until PROT_MTE is
> used. If software wants some better performance they can explicitly opt
> in to asynchronous mode or disable tag checking after some SIGSEGV +
> reporting (this shouldn't exclude the environment variables you
> currently use for controlling the tag check mode).
> 
> Also, if there are saner defaults for the user GCR_EL1.Excl (currently
> all masked), we should decide them now.
> 
> If stack tagging will come with some ELF information, we could make the
> default tag checking and GCR_EL1.Excl choices based on that, otherwise
> maybe we should revisit the default configuration the kernel sets for
> the user in the absence of any other information.

do tag checks have overhead if PROT_MTE is not used?
i'd expect some checks are still done at memory access.
(and the tagged address syscall abi has to be in use.)

turning sync tag checks on early would enable most
of the interesting use cases (only PROT_MTE has to
be handled at runtime, not the prctls). however i
don't yet know how userspace will deal with compat
issues, i.e. it may not be valid to unconditionally
turn tag checks on early.
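
for reference, a sketch of what "turning sync tag checks on early"
could look like from userspace with the prctl interface from this
series. the fallback constant values are taken from the kernel uapi
linux/prctl.h. mte_sync_ctrl and enable_sync_tag_checks_early are
hypothetical helper names, and the 0xfffe include mask (every tag
except 0) is only an assumption, not a recommended default.

```c
#include <sys/prctl.h>

/* values from the kernel uapi linux/prctl.h, for older libc headers */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif
#ifndef PR_MTE_TCF_SYNC
#define PR_MTE_TCF_SYNC (1UL << 1)
#endif
#ifndef PR_MTE_TAG_SHIFT
#define PR_MTE_TAG_SHIFT 3
#endif

/* control word: tagged address abi + sync checks + tag include mask */
unsigned long mte_sync_ctrl(unsigned long include_mask)
{
	return PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC |
	       ((include_mask & 0xffffUL) << PR_MTE_TAG_SHIFT);
}

/*
 * applies to the calling thread only. returns 0 on success and -1
 * with errno set when the kernel or cpu does not support mte.
 */
int enable_sync_tag_checks_early(void)
{
	return prctl(PR_SET_TAGGED_ADDR_CTRL, mte_sync_ctrl(0xfffe), 0, 0, 0);
}
```

if this runs while the process is still single threaded, threads
created later would be expected to inherit the setting, so only
PROT_MTE is left to handle at runtime.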

> > > > - library code normally initializes per thread state on the first call
> > > >   into the library from a given thread, but with mte, as soon as
> > > >   memory / pointers are tagged in one thread, all threads are
> > > >   affected: not performing checks in other threads is less secure (may
> > > >   be ok) and it means incompatible syscall abi (not ok). so at least
> > > >   PR_TAGGED_ADDR_ENABLE should have process wide setting for this
> > > >   usage.
> > > 
> > > My assumption with MTE is that the libc will initialise it when the
> > > library is loaded (something __attribute__((constructor))) and it's
> > > still in single-threaded mode. Does it wait until the first malloc()
> > > call? Also, is there such thing as a per-thread initialiser for a
> > > dynamic library (not sure it can be implemented in practice though)?
> > 
> > there is no per thread initializer in an elf module.
> > (tls state is usually initialized lazily in threads
> > when necessary.)
> > 
> > malloc calls can happen before the ctors of an LD_PRELOAD
> > library and threads can be created before both.
> > glibc runs ldpreload ctors after other library ctors.
> 
> In the presence of stack tagging, I think any subsequent MTE config
> change across all threads is unsafe, irrespective of whether it's done
> by the kernel or user via SIGUSRx. I think the best we can do here is
> start with more appropriate defaults or enable them based on an ELF note
> before the application is started. The dynamic loader would not have to
> do anything extra here.
> 
> If we ignore stack tagging, the global configuration change may be
> achievable. I think for the MTE bits, this could be done lazily by the
> libc (e.g. on malloc()/free() call). The tag checking won't happen
> before such calls unless we change the kernel defaults. There is still
> the tagged address ABI enabling, could this be done lazily on syscall by
> the libc? If not, the kernel could synchronise (force) this on syscall
> entry from each thread based on some global prctl() bit.

i think the interesting use-cases are all about
changing mte settings before mte is in use in any
way but after there are multiple threads.
(the async -> sync mode change on tag faults is
i think less interesting to the gnu linux world.)

i guess lazy syscall abi switch works, but it is
ugly: raw syscall usage will be problematic and
doing checks before calling into the vdso might
have unwanted overhead.
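
a rough sketch of the lazy switch, to show where the extra check
sits. ensure_tagged_abi is a hypothetical hook that a libc syscall
wrapper would call first. fake_enable stands in for the real
prctl(PR_SET_TAGGED_ADDR_CTRL, ...) call so the sketch runs on any
hardware.

```c
/* per thread flag: has this thread switched to the tagged address abi? */
static _Thread_local int tagged_abi_on;

/* stand-in for the real prctl call, counts how often it is reached */
int enable_calls;
int fake_enable(void)
{
	enable_calls++;
	return 0;
}

/*
 * hypothetical hook run before every libc syscall wrapper.
 * 'enable' performs the per thread abi switch and returns 0 on
 * success. the flag is only set once the switch has succeeded.
 */
int ensure_tagged_abi(int (*enable)(void))
{
	if (tagged_abi_on)	/* the check paid on every syscall */
		return 0;
	if (enable() != 0)
		return -1;
	tagged_abi_on = 1;
	return 0;
}
```

code that issues raw syscalls bypasses the hook entirely, and the
vdso path would need the same test before each call.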

based on the discussion it seems we should design
the userspace abis so that per process prctl is
not required and then see how far we get.