[PATCH 1/5] ARM: pgtable: switch order of Linux vs hardware page tables
Catalin Marinas
catalin.marinas at arm.com
Fri Nov 26 09:41:29 EST 2010
On Fri, 2010-11-26 at 11:38 +0000, Russell King - ARM Linux wrote:
> On Fri, Nov 19, 2010 at 11:48:31AM +0000, Catalin Marinas wrote:
> > On 17 November 2010 17:28, Russell King - ARM Linux
> > <linux at arm.linux.org.uk> wrote:
> > > --- a/arch/arm/mm/proc-v7.S
> > > +++ b/arch/arm/mm/proc-v7.S
> > > @@ -158,7 +156,7 @@ ENTRY(cpu_v7_set_pte_ext)
> > > tstne r1, #L_PTE_PRESENT
> > > moveq r3, #0
> > >
> > > - str r3, [r0]
> > > + str r3, [r0, #2048]!
> >
> > Thumb-2 build gives "offset out of range". We need to do a separate
> > ADD for this case.
>
> Do we have any clues about the typical timing of:
>
> str r3, [r0, #2048]!
> mcr p15, 0, r0, c7, c10, 1
>
> vs:
> add r0, r0, #2048
> str r3, [r0]
> mcr p15, 0, r0, c7, c10, 1
>
> or
> str r3, [r0, #2048]
> add r0, r0, #2048
> mcr p15, 0, r0, c7, c10, 1
>
> on ARMv7?
Since the final mcr has an address dependency on r0, all three sequences
may well take the same number of cycles.
For Thumb-2, the last one could generally be better, since the str has a
bit more time to complete before the cache flush.
For ARM, the advantage of the first one (writeback) is that it saves an
instruction and leaves more room in the prefetch buffer, though I'm not
sure this would be noticeable. You could use the ARM()/THUMB() macros to
pick a different sequence for each instruction set.
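
As a rough, untested sketch of what that could look like in
cpu_v7_set_pte_ext: this assumes the ARM() and THUMB() wrappers from
asm/unified.h, which emit their argument only in the matching kernel
build. The Thumb-2 path uses the third sequence from above, because the
writeback form of str only takes a small immediate there, while the
plain-offset form accepts #2048. The comments are my own annotations.

```
 ARM(	str	r3, [r0, #2048]! )	@ ARM: pre-indexed store with writeback
 THUMB(	str	r3, [r0, #2048] )	@ Thumb-2: plain offset, no writeback
 THUMB(	add	r0, r0, #2048 )		@ Thumb-2: point r0 at the hardware PTE
	mcr	p15, 0, r0, c7, c10, 1	@ clean the D-cache line holding the PTE
```

Either way r0 holds the address of the hardware entry by the time the
mcr executes, so the cache operation is the same in both builds.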
--
Catalin