prefetch inconsistency in copy_page()

Will Deacon will.deacon at arm.com
Tue Jul 11 11:26:00 PDT 2017


On Tue, Jul 11, 2017 at 06:58:54PM +0100, Ard Biesheuvel wrote:
> On 11 July 2017 at 18:53, Pinski, Andrew <Andrew.Pinski at cavium.com> wrote:
> >> -----Original Message-----
> >> From: Will Deacon [mailto:will.deacon at arm.com]
> >> Sent: Tuesday, July 11, 2017 10:53 AM
> >> To: Ard Biesheuvel <ard.biesheuvel at linaro.org>
> >> Cc: Mark Rutland <mark.rutland at arm.com>; Pinski, Andrew
> >> <Andrew.Pinski at cavium.com>; linux-arm-kernel at lists.infradead.org
> >> Subject: Re: prefetch inconsistency in copy_page()
> >>
> >> On Tue, Jul 11, 2017 at 10:42:19AM +0100, Ard Biesheuvel wrote:
> >> > Hi all,
> >> >
> >> > No big deal, but I spotted an inconsistency in how copy_page() does
> >> > its prefetching. With the code as-is, it prefetches 2 lines at the
> >> > start of the function, but it is off by one line in the loop, i.e., it
> >> > prefetches 3 lines ahead (and skips a line at entry). So imo, we need
> >> > either
> >> >
> >> > """
> >> > @@ -30,9 +30,10 @@
> >> >   */
> >> >  ENTRY(copy_page)
> >> >  alternative_if ARM64_HAS_NO_HW_PREFETCH
> >> > -       # Prefetch two cache lines ahead.
> >> > +       # Prefetch three cache lines ahead.
> >> >         prfm    pldl1strm, [x1, #128]
> >> >         prfm    pldl1strm, [x1, #256]
> >> > +       prfm    pldl1strm, [x1, #384]
> >> >  alternative_else_nop_endif
> >> >
> >> >         ldp     x2, x3, [x1]
> >> > """
> >> >
> >> > or
> >> >
> >> > """
> >> > @@ -50,7 +51,7 @@ alternative_else_nop_endif
> >> >         subs    x18, x18, #128
> >> >
> >> >  alternative_if ARM64_HAS_NO_HW_PREFETCH
> >> > -       prfm    pldl1strm, [x1, #384]
> >> > +       prfm    pldl1strm, [x1, #256]
> >> >  alternative_else_nop_endif
> >> >
> >> >         stnp    x2, x3, [x0]
> >> >
> >> > """
> >> >
> >> > to make things consistent again.
> >> >
> >> > Thoughts?
> >>
> >> Either of those look good to me. Andrew -- any preference? If not, I'll
> >> take the second one.
> >
> > The first one is my preference but both are ok to me.
> >
> 
> I suppose the second one is the least likely to invalidate existing
> benchmark results.
> 
> Will: should I spin it as a proper patch? Mind if I clean up the
> whitespace at the same time?

Fill your boots!

Will



More information about the linux-arm-kernel mailing list