[PATCH 1/5] ARM: pgtable: switch order of Linux vs hardware page tables

Catalin Marinas catalin.marinas at arm.com
Fri Nov 26 09:41:29 EST 2010


On Fri, 2010-11-26 at 11:38 +0000, Russell King - ARM Linux wrote:
> On Fri, Nov 19, 2010 at 11:48:31AM +0000, Catalin Marinas wrote:
> > On 17 November 2010 17:28, Russell King - ARM Linux
> > <linux at arm.linux.org.uk> wrote:
> > > --- a/arch/arm/mm/proc-v7.S
> > > +++ b/arch/arm/mm/proc-v7.S
> > > @@ -158,7 +156,7 @@ ENTRY(cpu_v7_set_pte_ext)
> > >        tstne   r1, #L_PTE_PRESENT
> > >        moveq   r3, #0
> > >
> > > -       str     r3, [r0]
> > > +       str     r3, [r0, #2048]!
> >
> > Thumb-2 build gives "offset out of range".  We need to do a separate
> > ADD for this case.
> 
> Do we have any clues about the typical timing of:
> 
>         str     r3, [r0, #2048]!
>         mcr     p15, 0, r0, c7, c10, 1
> 
> vs:
>         add     r0, r0, #2048
>         str     r3, [r0]
>         mcr     p15, 0, r0, c7, c10, 1
> 
> or
>         str     r3, [r0, #2048]
>         add     r0, r0, #2048
>         mcr     p15, 0, r0, c7, c10, 1
> 
> on ARMv7?

Since there is an address (r0) dependency in the last mcr, all three may
take the same number of cycles.

For T2, the last one is generally the better option, since the str has a
bit more time to complete before the cache flush.

For ARM, the advantage of the first one (writeback) is that we don't need
an extra instruction, which leaves more room in the prefetch buffer,
though I'm not sure this would be noticeable. You could use the
ARM()/THUMB() macros to pick a different sequence for each instruction
set.
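
As a rough, untested sketch of the macro approach, something like the
following might work. It assumes the ARM() and THUMB() helpers from
arch/arm/include/asm/unified.h are available in proc-v7.S. It keeps the
writeback form for ARM and uses the last variant above for Thumb-2, where
the pre-indexed writeback encoding cannot take a 2048 offset but the
plain immediate-offset str can:

```
 ARM(   str     r3, [r0, #2048]! )      @ ARM: store with pre-index writeback
 THUMB( str     r3, [r0, #2048] )       @ T2: plain offset form is in range
 THUMB( add     r0, r0, #2048 )         @ T2: update r0 separately for the mcr
        mcr     p15, 0, r0, c7, c10, 1  @ clean D-cache line by MVA in r0
```

Both paths leave r0 pointing at the written entry before the mcr, so the
cache maintenance operation is shared between the two builds.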

-- 
Catalin
