[RFC] mixture of cleanups to cache-v7.S

Lorenzo Pieralisi lorenzo.pieralisi at arm.com
Fri Apr 10 08:37:51 PDT 2015


On Fri, Apr 10, 2015 at 03:26:55PM +0100, Russell King - ARM Linux wrote:
> On Fri, Apr 10, 2015 at 03:11:05PM +0100, Lorenzo Pieralisi wrote:
> > On Fri, Apr 10, 2015 at 02:27:42PM +0100, Catalin Marinas wrote:
> > > As for the v7_flush_dcache_all always having the barriers, I guess
> > > no-one is expecting a v7 CPU without caches.
> > 
> > Well, yes, actually I'd rather remove the barriers if LoC is 0 instead
> > of adding them to LoUIS API in case there is nothing to flush, I do not
> > think that's a problem in the first place.
> 
> Maybe you should read through the patch set before declaring there to
> be no problem.
> 
> There may be no problem as the code stands right now, and we can bury
> our heads in the sand and continue as through there's no problem what
> so ever, and we'd be right to, but...
> 
> We need to have 643719 enabled for all platforms.  Right now, the code
> in cache-v7 surrounding that workaround is pretty crap - enabling the
> option places the workaround into the execution path whether it applies
> or not.
> 
> What this patch series does is rearrange stuff to avoid the workaround
> where it's possible to do so, while cleaning up a lot of the crap coding
> that was done here in the first place.
> 
> In doing this, I spotted that if we _solve_ the above issue by making
> both functions behave in the same way, we get more common code paths.

I certainly agree it is a good clean-up, I was referring to barriers and
the barrier semantics.

> So, while the issue doesn't _cause_ a problem, fixing the disparity
> between the two functions is worth it just to be able to have a
> flush_levels implementation which behaves the same no matter how it's
> called - one which always has a dmb before it, and (possibly) always
> the dsb+isb afterwards - or only has the dsb+isb afterwards if it's
> done some cache flushing.

In actual facts the latter is what's required, for practical
purposes the former solution is acceptable to fix the disparity.

> Look at the resulting code after applying all the patches, and I'm sure
> you'd agree that it's a lot better than the crap which is currently
> there.  To save you having to look up the files and apply the patches,
> here are the old and new versions:
> 
> ============= old ====================
> ENTRY(v7_flush_dcache_louis)
>         dmb                                     @ ensure ordering with previous memory accesses
>         mrc     p15, 1, r0, c0, c0, 1           @ read clidr, r0 = clidr
>         ALT_SMP(ands    r3, r0, #(7 << 21))     @ extract LoUIS from clidr
>         ALT_UP(ands     r3, r0, #(7 << 27))     @ extract LoUU from clidr
> #ifdef CONFIG_ARM_ERRATA_643719
>         ALT_SMP(mrceq   p15, 0, r2, c0, c0, 0)  @ read main ID register
>         ALT_UP(reteq    lr)                     @ LoUU is zero, so nothing to do        ldreq   r1, =0x410fc090                 @ ID of ARM Cortex A9 r0p?
>         biceq   r2, r2, #0x0000000f             @ clear minor revision number
>         teqeq   r2, r1                          @ test for errata affected core and if so...
>         orreqs  r3, #(1 << 21)                  @   fix LoUIS value (and set flags state to 'ne')
> #endif
>         ALT_SMP(mov     r3, r3, lsr #20)        @ r3 = LoUIS * 2
>         ALT_UP(mov      r3, r3, lsr #26)        @ r3 = LoUU * 2
>         reteq   lr                              @ return if level == 0
>         mov     r10, #0                         @ r10 (starting level) = 0
>         b       flush_levels                    @ start flushing cache levels
> 
> ENTRY(v7_flush_dcache_all)
>         dmb                                     @ ensure ordering with previous memory accesses
>         mrc     p15, 1, r0, c0, c0, 1           @ read clidr
>         ands    r3, r0, #0x7000000              @ extract loc from clidr
>         mov     r3, r3, lsr #23                 @ left align loc bit field
>         beq     finished                        @ if loc is 0, then no need to clean
>         mov     r10, #0                         @ start clean at cache level 0
> flush_levels:
>         add     r2, r10, r10, lsr #1            @ work out 3x current cache level
>         mov     r1, r0, lsr r2                  @ extract cache type bits from clidr
> ...
> ============= new ====================
> ENTRY(v7_flush_dcache_louis)
>         mrc     p15, 1, r0, c0, c0, 1           @ read clidr, r0 = clidr
> ALT_SMP(mov     r3, r0, lsr #20)                @ move LoUIS into position
> ALT_UP( mov     r3, r0, lsr #26)                @ move LoUU into position
> #ifdef CONFIG_ARM_ERRATA_643719
> ALT_SMP(ands    r3, r3, #7 << 1)                @ extract LoU*2 field from clidrALT_UP( b       start_flush_levels)
>         bne     start_flush_levels              @ LoU != 0, start flushing
>         mrc     p15, 0, r2, c0, c0, 0           @ read main ID register
>         movw    r1, #:lower16:(0x410fc090 >> 4) @ ID of ARM Cortex A9 r0p?
>         movt    r1, #:upper16:(0x410fc090 >> 4)
>         teq     r1, r2, lsr #4                  @ test for errata affected core
> and if so...
>         moveq   r3, #1 << 1                     @ fix LoUIS value (and set flags state to 'ne')
> #endif
>         b       start_flush_levels              @ start flushing cache levels
> 
> ENTRY(v7_flush_dcache_all)
>         mrc     p15, 1, r0, c0, c0, 1           @ read clidr
>         mov     r3, r0, lsr #23                 @ move LoC into position
> start_flush_levels:
>         dmb                                     @ ensure ordering with previous memory accesses
>         ands    r3, r3, #7 << 1                 @ extract field from clidr
>         beq     finished                        @ if loc is 0, then no need to clean
>         mov     r10, #0                         @ start clean at cache level 0
> flush_levels:
>         add     r2, r10, r10, lsr #1            @ work out 3x current cache level
>         mov     r1, r0, lsr r2                  @ extract cache type bits from clidr
> ...
> 
> Which do you think is better, the old version or the new version?

The old version, joking ;). I was not referring to the clean-up which
is a good thing on its own, we just have to agree on barriers semantics
in case the cache flushing level turns out to be 0, see above.

Thanks,
Lorenzo



More information about the linux-arm-kernel mailing list