[PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
Mark Rutland
mark.rutland at arm.com
Tue Oct 26 05:23:00 PDT 2021
On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote:
> On 2021-10-26 04:48, Reiji Watanabe wrote:
> > Currently, clear_page() uses the DC ZVA instruction unconditionally. But it
> > should make sure that DCZID_EL0.DZP, which indicates whether use of the
> > DC ZVA instruction is prohibited, is zero before using the instruction.
> > Use stp instead, as memset does, when DCZID_EL0.DZP == 1.
> >
> > Signed-off-by: Reiji Watanabe <reijiw at google.com>
> > ---
> > arch/arm64/lib/clear_page.S | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> > index b84b179edba3..7ce1bfa4081c 100644
> > --- a/arch/arm64/lib/clear_page.S
> > +++ b/arch/arm64/lib/clear_page.S
> > @@ -16,6 +16,7 @@
> > */
> > SYM_FUNC_START_PI(clear_page)
> > mrs x1, dczid_el0
> > + tbnz x1, #4, 2f /* Branch if DC GVA is prohibited */
DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA}
are prohibited. This loop uses DC ZVA, not DC GVA, so it'd be nice to
s/GVA/ZVA/ here.
However, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(),
which'll need a similar update...
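For reference, the DCZID_EL0 decode that clear_page() performs — DZP in bit
[4], BS in bits [3:0], block size of 4 << BS bytes — can be modelled like so
(a minimal Python sketch; the sample register values are made up for
illustration, not read from real hardware):

```python
# Model of the DCZID_EL0 decode at the top of clear_page():
#   mrs  x1, dczid_el0
#   tbnz x1, #4, 2f        /* DZP: DC {ZVA,GVA,GZVA} prohibited */
#   and  w1, w1, #0xf      /* BS: log2 of the block size in words */
#   mov  x2, #4
#   lsl  x1, x2, x1        /* block size in bytes = 4 << BS */

def decode_dczid(dczid: int) -> tuple[bool, int]:
    """Return (dzp_prohibited, block_size_bytes) for a DCZID_EL0 value."""
    dzp = bool(dczid & (1 << 4))   # tbnz x1, #4
    bs = dczid & 0xf               # and w1, w1, #0xf
    return dzp, 4 << bs            # lsl x1, x2, x1

# Hypothetical values: BS = 4 gives 64-byte blocks; bit 4 set means DZP == 1.
print(decode_dczid(0x4))    # (False, 64)
print(decode_dczid(0x14))   # (True, 64)
```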
> > and w1, w1, #0xf
> > mov x2, #4
> > lsl x1, x2, x1
> > @@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page)
> > tst x0, #(PAGE_SIZE - 1)
> > b.ne 1b
> > ret
> > +
> > +2: mov x1, #(PAGE_SIZE)
> > + sub x0, x0, #16 /* Pre-bias. */
>
> Out of curiosity, what's this for? It's not like we need to worry about
> PAGE_SIZE or page addresses being misaligned. I don't really see why we'd
> need a different condition from the DC ZVA loop.
I believe this was copied from arch/arm64/lib/memset.S, in the
`.Lnot_short` case, where we have:
| .Lnot_short:
| sub dst, dst, #16 /* Pre-bias. */
| sub count, count, #64
| 1:
| stp A_l, A_l, [dst, #16]
| stp A_l, A_l, [dst, #32]
| stp A_l, A_l, [dst, #48]
| stp A_l, A_l, [dst, #64]!
| subs count, count, #64
| b.ge 1b
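The pre-bias lets the last stp in the group use writeback (`[dst, #64]!`), so
one instruction both stores and advances the pointer. A quick model of the
store pattern (Python, byte offsets only, count assumed to be a multiple
of 64):

```python
def prebias_stp_pattern(count: int) -> list[int]:
    """Offsets of the 16-byte stp stores issued by the memset.S
    .Lnot_short loop, for a 64-byte-multiple count."""
    stores = []
    dst = -16              # sub dst, dst, #16  /* Pre-bias. */
    count -= 64            # sub count, count, #64
    while True:
        # stp at [dst,#16], [dst,#32], [dst,#48], then [dst,#64]!
        stores += [dst + 16, dst + 32, dst + 48, dst + 64]
        dst += 64          # writeback from the last stp
        count -= 64        # subs count, count, #64
        if count < 0:      # b.ge 1b
            break
    return stores

# Covers every 16-byte slot of a 256-byte buffer exactly once.
print(prebias_stp_pattern(256))   # [0, 16, 32, ..., 240]
```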
> Robin.
>
> > +3: stp xzr, xzr, [x0, #16]
> > + stp xzr, xzr, [x0, #32]
> > + stp xzr, xzr, [x0, #48]
> > + stp xzr, xzr, [x0, #64]!
> > + subs x1, x1, #64
> > + b.gt 3b
> > + ret
FWIW, I'd also prefer consistency with the existing loop, i.e.
2: stp xzr, xzr, [x0, #0]
stp xzr, xzr, [x0, #16]
stp xzr, xzr, [x0, #32]
stp xzr, xzr, [x0, #48]
add x0, x0, #64
tst x0, #(PAGE_SIZE - 1)
b.ne 2b
ret
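FWIW, the tst-based exit works because x0 is page-aligned on entry, so
(x0 & (PAGE_SIZE - 1)) only returns to zero once the pointer has advanced a
full page. A quick model of the suggested loop (Python, PAGE_SIZE assumed 4K
for illustration):

```python
PAGE_SIZE = 4096  # assumed 4K pages for illustration

def clear_page_stp(x0: int) -> int:
    """Model of the suggested stp fallback loop; returns bytes cleared."""
    assert x0 % PAGE_SIZE == 0          # page-aligned on entry
    cleared = 0
    while True:
        cleared += 64                   # four stp xzr, xzr pairs
        x0 += 64                        # add x0, x0, #64
        if x0 & (PAGE_SIZE - 1) == 0:   # tst x0, #(PAGE_SIZE - 1) / b.ne 2b
            break
    return cleared

print(clear_page_stp(0x10000))   # 4096
```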
Thanks,
Mark.
> > SYM_FUNC_END_PI(clear_page)
> > EXPORT_SYMBOL(clear_page)
> >
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel