[PATCH v2] arm64: mte: avoid clearing PSTATE.TCO on entry unless necessary

Catalin Marinas catalin.marinas at arm.com
Fri Dec 17 10:16:56 PST 2021


On Wed, Dec 15, 2021 at 06:44:03PM -0800, Peter Collingbourne wrote:
> On Fri, Dec 10, 2021 at 4:06 AM Catalin Marinas <catalin.marinas at arm.com> wrote:
> > On Wed, Nov 10, 2021 at 02:07:35PM -0800, Peter Collingbourne wrote:
> > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > index 2f69ae43941d..a78ec15f5bbc 100644
> > > --- a/arch/arm64/kernel/entry.S
> > > +++ b/arch/arm64/kernel/entry.S
> > > @@ -189,6 +189,27 @@ alternative_cb_end
> > >  #endif
> > >       .endm
> > >
> > > +     .macro mte_clear_tco, sctlr
> > > +     /*
> > > +      * Re-enable tag checking (TCO set on exception entry). This is only
> > > +      * necessary if MTE is enabled in either the kernel or the userspace
> > > +      * task in synchronous mode. With MTE disabled in the kernel and
> > > +      * disabled or asynchronous in userspace, tag check faults (including in
> > > +      * uaccesses) are not reported, therefore there is no need to re-enable
> > > +      * checking. This is beneficial on microarchitectures where re-enabling
> > > +      * TCO is expensive.
> > > +      */
> > > +#ifdef CONFIG_ARM64_MTE
> > > +alternative_cb       kasan_hw_tags_enable
> > > +     tbz     \sctlr, #SCTLR_EL1_TCF0_SHIFT, 1f
> > > +alternative_cb_end
> > > +alternative_if ARM64_MTE
> > > +     SET_PSTATE_TCO(0)
> > > +alternative_else_nop_endif
> > > +1:
> > > +#endif
> > > +     .endm
> >
> > The patch looks fine to me but I recall in a private conversation with
> > Mark he proposed the idea of moving this to entry-common.c (unless it
> > was about something completely different). The downside is that we run
> > with the TCO set for slightly longer. There shouldn't be a major
> > drawback currently as we don't have stack tagging anyway.
> 
> Yes, Mark made that suggestion on the list. I tried it and found that
> it led to a performance regression relative to baseline [1].
[...]
> [1] https://lore.kernel.org/all/CAMn1gO51k1Dqts=THYq28nVMSvO6ZQB5sEG1wuzEVpAXBTfFjg@mail.gmail.com/

That's weird since there should be the same number of instructions
executed on entry with your diff above. Could it be that
mte_disable_tco_entry() is not inlined?

-- 
Catalin



More information about the linux-arm-kernel mailing list