[PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1

Mark Rutland mark.rutland at arm.com
Thu Oct 28 02:03:07 PDT 2021


On Thu, Oct 28, 2021 at 08:46:10AM +0100, Will Deacon wrote:
> On Wed, Oct 27, 2021 at 12:09:47PM +0100, Mark Rutland wrote:
> > On Tue, Oct 26, 2021 at 11:44:51PM -0700, Reiji Watanabe wrote:
> > > On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland at arm.com> wrote:
> > > > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote:
> > > > > On 2021-10-26 04:48, Reiji Watanabe wrote:
> > > > > > Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> > > > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use
> > > > > > of DC ZVA instruction is prohibited, is zero when using the instruction.
> > > > > > Use stp as memset does instead when DCZID_EL0.DZP == 1.
> > > > > >
> > > > > > Signed-off-by: Reiji Watanabe <reijiw at google.com>
> > > > > > ---
> > > > > >   arch/arm64/lib/clear_page.S | 11 +++++++++++
> > > > > >   1 file changed, 11 insertions(+)
> > > > > >
> > > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> > > > > > index b84b179edba3..7ce1bfa4081c 100644
> > > > > > --- a/arch/arm64/lib/clear_page.S
> > > > > > +++ b/arch/arm64/lib/clear_page.S
> > > > > > @@ -16,6 +16,7 @@
> > > > > >    */
> > > > > >   SYM_FUNC_START_PI(clear_page)
> > > > > >     mrs     x1, dczid_el0
> > > > > > +   tbnz    x1, #4, 2f      /* Branch if DC GVA is prohibited */
> > > >
> > > > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA}
> > > > are prohibited. This loop uses DC ZVA, not DC GVA, so it'd be nice to
> > > > s/GVA/ZVA/ here.
> > > 
> > > Thank you for catching it! I will fix that.
> > > 
> > > > However, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(),
> > > > which'll need a similar update...
> > > 
> > > Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get
> > > updated as well.  But since I'm not familiar with MTE (and I don't
> > > have any plans to use MTE yet), I didn't work on them (I'm not sure
> > > how I can test them).
> > > I might try to fix them separately later as well when I have time
> > > (not so soon most likely though).
> > 
> > My view is that we should either:
> > 
> > * Document that we require DCZID_EL0.DZP==0, as is implicitly the case
> >   today.
> 
> I disagree with that. There's nothing wrong in trapping this stuff, as long
> as you go ahead and emulate it, which is exactly why we aren't checking it at
> the moment as EL2 should be prepared to handle the trap. The Arm ARM talks
> about the instructions being "prohibited" but that doesn't mean anything --
> the reality is that they trap to EL2.

TBH, I think trapping DC ZVA rather than forcing it to be UNDEF was an
architectural mistake.

I think the intent of the "prohibited" wording (and of exposing
DCZID_EL0.DZP to EL0 and EL1) is clearly that SW should *avoid* DC ZVA and
friends when DCZID_EL0.DZP is set; libc and the Arm optimized routines
have *always* checked that. Performance would also be abysmal with
emulated DC ZVA, so at minimum we want to avoid hitting emulation in the
common case.

> We could document *that* though?

I guess; though it makes me uneasy since the architecture clearly pushes
people to read DCZID_EL0.DZP, and heavily implies that there's no need to
emulate, so I don't think we have much of a leg to stand on from an
architecture PoV.

If we need to support this, my preference would be to support
DCZID_EL0.DZP==1 by avoiding the trapped instructions entirely.
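
To illustrate roughly what "avoiding the trapped instructions" would look
like, here is a minimal sketch in portable C rather than the real
clear_page.S. The helper names (dczid_zva_prohibited, dczid_block_bytes,
zero_region) are hypothetical, the register value is passed in as a plain
integer instead of being read with mrs, and the per-block memset() is only
a stand-in for the actual `dc zva`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* DCZID_EL0 layout: DZP is bit 4, BS is bits [3:0]. */
#define DCZID_DZP_BIT	(UINT64_C(1) << 4)
#define DCZID_BS_MASK	UINT64_C(0xf)

/* Nonzero when DC ZVA (and DC GVA/GZVA) are reported as prohibited. */
static inline int dczid_zva_prohibited(uint64_t dczid)
{
	return (dczid & DCZID_DZP_BIT) != 0;
}

/*
 * BS is log2 of the block size in 4-byte words, so the size in bytes
 * is 4 << BS. Only meaningful when DZP == 0.
 */
static inline size_t dczid_block_bytes(uint64_t dczid)
{
	return (size_t)4 << (dczid & DCZID_BS_MASK);
}

/*
 * Hypothetical helper: zero len bytes at buf. Returns the block size
 * used on the "DC ZVA" path, or 0 when the plain-store fallback ran.
 */
static inline size_t zero_region(void *buf, size_t len, uint64_t dczid)
{
	size_t step, off;

	if (dczid_zva_prohibited(dczid)) {
		/* Fallback path: ordinary stores (stp in the asm version). */
		memset(buf, 0, len);
		return 0;
	}

	step = dczid_block_bytes(dczid);
	assert(len % step == 0);	/* a page is a multiple of any legal block size */

	for (off = 0; off < len; off += step)
		memset((char *)buf + off, 0, step);	/* stand-in for: dc zva, <addr> */

	return step;
}
```

With a register value of 0x14 (DZP set, BS == 4) this takes the fallback
path; with 0x04 it walks the buffer in 64-byte blocks.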

I realise the problem as ever is "what does userspace do?".

Thanks,
Mark.


