[PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1

Mark Rutland mark.rutland at arm.com
Tue Oct 26 05:23:00 PDT 2021


On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote:
> On 2021-10-26 04:48, Reiji Watanabe wrote:
> > Currently, clear_page() uses the DC ZVA instruction unconditionally.
> > However, it should first check DCZID_EL0.DZP, which indicates whether
> > use of the DC ZVA instruction is prohibited, and only issue DC ZVA
> > when DZP is zero.  When DCZID_EL0.DZP == 1, use stp instead, as
> > memset does.
> > 
> > Signed-off-by: Reiji Watanabe <reijiw at google.com>
> > ---
> >   arch/arm64/lib/clear_page.S | 11 +++++++++++
> >   1 file changed, 11 insertions(+)
> > 
> > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> > index b84b179edba3..7ce1bfa4081c 100644
> > --- a/arch/arm64/lib/clear_page.S
> > +++ b/arch/arm64/lib/clear_page.S
> > @@ -16,6 +16,7 @@
> >    */
> >   SYM_FUNC_START_PI(clear_page)
> >   	mrs	x1, dczid_el0
> > +	tbnz	x1, #4, 2f	/* Branch if DC GVA is prohibited */

DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA}
are prohibited. This loop uses DC ZVA, not DC GVA, so it'd be nice to
s/GVA/ZVA/ here.

However, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(),
which'll need a similar update...
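
For reference, the relevant DCZID_EL0 fields decode roughly as below in
C (an illustrative sketch, not kernel code; read_dczid_el0() is a
hypothetical helper):

#include <stdint.h>
#include <stdbool.h>

static inline uint64_t read_dczid_el0(void)
{
	uint64_t val;

	asm volatile("mrs %0, dczid_el0" : "=r" (val));
	return val;
}

/* DZP, bit [4]: when set, DC {ZVA,GVA,GZVA} are all prohibited. */
static inline bool dc_zva_prohibited(void)
{
	return read_dczid_el0() & (1UL << 4);
}

/* BS, bits [3:0]: log2 of the block size in words, i.e. 4 << BS bytes. */
static inline unsigned long dc_zva_block_size(void)
{
	return 4UL << (read_dczid_el0() & 0xf);
}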

> >   	and	w1, w1, #0xf
> >   	mov	x2, #4
> >   	lsl	x1, x2, x1
> > @@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page)
> >   	tst	x0, #(PAGE_SIZE - 1)
> >   	b.ne	1b
> >   	ret
> > +
> > +2:	mov	x1, #(PAGE_SIZE)
> > +	sub	x0, x0, #16	/* Pre-bias. */
> 
> Out of curiosity, what's this for? It's not like we need to worry about
> PAGE_SIZE or page addresses being misaligned. I don't really see why we'd
> need a different condition from the DC ZVA loop.

I believe this was copied from arch/arm64/lib/memset.S, in the
`.Lnot_short` case, where we have:

| .Lnot_short:
|         sub     dst, dst, #16/* Pre-bias.  */
|         sub     count, count, #64
| 1:
|         stp     A_l, A_l, [dst, #16]
|         stp     A_l, A_l, [dst, #32]
|         stp     A_l, A_l, [dst, #48]
|         stp     A_l, A_l, [dst, #64]!
|         subs    count, count, #64
|         b.ge    1b
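
The pre-bias exists because that loop uses the pre-index writeback form
`[dst, #64]!`: each iteration's four stores land at offsets #16..#64
from the current dst, so dst has to start 16 bytes below the buffer for
the first store to hit offset 0. Taking the fill value A_l as zero, the
quoted loop behaves like this C model (purely illustrative, not kernel
code):

#include <string.h>

static void memset_loop_model(char *dst, long count)
{
	dst -= 16;			/* Pre-bias. */
	count -= 64;
	do {
		memset(dst + 16, 0, 16);	/* stp A_l, A_l, [dst, #16] */
		memset(dst + 32, 0, 16);	/* stp A_l, A_l, [dst, #32] */
		memset(dst + 48, 0, 16);	/* stp A_l, A_l, [dst, #48] */
		dst += 64;			/* stp A_l, A_l, [dst, #64]! */
		memset(dst, 0, 16);		/*  ... writeback, then store */
		count -= 64;
	} while (count >= 0);			/* subs + b.ge */
}

As Robin says, none of that is necessary when the destination is a
whole, aligned page.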

> Robin.
> 
> > +3:	stp	xzr, xzr, [x0, #16]
> > +	stp	xzr, xzr, [x0, #32]
> > +	stp	xzr, xzr, [x0, #48]
> > +	stp	xzr, xzr, [x0, #64]!
> > +	subs	x1, x1, #64
> > +	b.gt	3b
> > +	ret

FWIW, I'd also prefer consistency with the existing loop, i.e.

2:	stp	xzr, xzr, [x0, #0]
	stp	xzr, xzr, [x0, #16]
	stp	xzr, xzr, [x0, #32]
	stp	xzr, xzr, [x0, #48]
	add	x0, x0, #64
	tst	x0, #(PAGE_SIZE - 1)
	b.ne	2b
	ret
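
That keeps the same termination condition as the DC ZVA loop (stop once
x0 wraps around to the next page boundary) and avoids carrying a
separate count register. In C, the suggested loop amounts to something
like this (illustrative model only; PAGE_SIZE of 4096 is an assumption
for the sketch):

#include <stdint.h>
#include <string.h>

#define PAGE_SIZE	4096UL		/* assumption for the sketch */

static void clear_page_stp_model(void *page)
{
	char *p = page;			/* must be page-aligned */

	do {
		memset(p, 0, 64);	/* four stp xzr, xzr pairs */
		p += 64;		/* add x0, x0, #64 */
	} while ((uintptr_t)p & (PAGE_SIZE - 1));	/* tst + b.ne */
}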

Thanks,
Mark.

> >   SYM_FUNC_END_PI(clear_page)
> >   EXPORT_SYMBOL(clear_page)
> > 