[PATCH] arm64: mte: avoid TFSR related operations unless in async mode

Peter Collingbourne pcc at google.com
Thu Jul 1 11:11:34 PDT 2021


On Thu, Jul 1, 2021 at 10:37 AM Catalin Marinas <catalin.marinas at arm.com> wrote:
>
> On Wed, Jun 30, 2021 at 08:14:48PM -0700, Peter Collingbourne wrote:
> > There is no reason to touch TFSR nor issue a DSB unless our task is
> > in asynchronous mode. Since these operations (especially the DSB)
> > may be expensive on certain microarchitectures, only perform them
> > if necessary.
>
> Have you noticed any performance degradation in practice?

Yes, we measured a degradation on the hardware that we have access to.
But as I'm sure you can understand I have to be a bit vague about the
numbers here.

> > Signed-off-by: Peter Collingbourne <pcc at google.com>
> > Link: https://linux-review.googlesource.com/id/Ib353a63e3d0abc2b0b008e96aa2d9692cfc1b815
> > ---
> >  arch/arm64/kernel/entry.S | 27 +++++++++++++++++++--------
> >  1 file changed, 19 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index 863d44f73028..c2338414c558 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -133,12 +133,18 @@ alternative_cb_end
> >       .endm
> >
> >       /* Check for MTE asynchronous tag check faults */
> > -     .macro check_mte_async_tcf, tmp, ti_flags
> > +     .macro check_mte_async_tcf, tmp, ti_flags, thread_sctlr
> >  #ifdef CONFIG_ARM64_MTE
> >       .arch_extension lse
> >  alternative_if_not ARM64_MTE
> >       b       1f
> >  alternative_else_nop_endif
> > +     /*
> > +      * Asynchronous tag check faults are only possible in ASYNC (2) or
> > +      * ASYM (3) modes. In each of these modes bit 1 of SCTLR_EL1.TCF0 is
> > +      * set, so skip the check if it is unset.
> > +      */
> > +     tbz     \thread_sctlr, #(SCTLR_EL1_TCF0_SHIFT + 1), 1f
> >       mrs_s   \tmp, SYS_TFSRE0_EL1
> >       tbz     \tmp, #SYS_TFSR_EL1_TF0_SHIFT, 1f
>
> I don't think this one is that expensive.
>
> >       /* Asynchronous TCF occurred for TTBR0 access, set the TI flag */
> > @@ -151,11 +157,14 @@ alternative_else_nop_endif
> >       .endm
> >
> >       /* Clear the MTE asynchronous tag check faults */
> > -     .macro clear_mte_async_tcf
> > +     .macro clear_mte_async_tcf thread_sctlr
> >  #ifdef CONFIG_ARM64_MTE
> >  alternative_if ARM64_MTE
> > +     /* See comment in check_mte_async_tcf above. */
> > +     tbz     \thread_sctlr, #(SCTLR_EL1_TCF0_SHIFT + 1), 1f
> >       dsb     ish
> >       msr_s   SYS_TFSRE0_EL1, xzr
> > +1:
>
> Here, maybe, as we have a DSB.

Yes, disabling clear_mte_async_tcf offered an order of magnitude
larger speedup than disabing check_mte_async_tcf, presumably due to
the DSB. I would reckon though that if we're going to make some of the
code conditional on TCF we might as well make all of it conditional in
order to get the maximum possible benefit.

Nevertheless, isn't it the case that disabling check_mte_async_tcf for
non-ASYNC tasks is necessary for correctness if we want to disable
clear_mte_async_tcf? Imagine that we just disable clear_mte_async_tcf,
and then we get a tag check failing uaccess in a TCF=ASYNC task which
then gets preempted by a TCF=NONE task which will skip clear on kernel
exit. If we don't disable check on kernel entry then I believe that we
will get a false positive tag check fault in the TCF=NONE task the
next time it enters the kernel.

Peter



More information about the linux-arm-kernel mailing list