[PATCH 0/3] ARM: mvebu: disable I/O coherency on !SMP

Russell King - ARM Linux linux at arm.linux.org.uk
Wed Jul 2 14:33:40 PDT 2014

On Wed, Jul 02, 2014 at 10:28:17PM +0200, Thomas Petazzoni wrote:
> Russell,
> On Wed, 2 Jul 2014 18:58:57 +0100, Russell King - ARM Linux wrote:
> > You say that, but Dove is PJ4B too (I forget whether it's PJ4B or not.)
> > The only way to be sure is to compare the ID numbers.  The AP510 in
> > Dove I have here has an ID register value of 0x560f5815.  This ties up
> > with the PJ4B entry in proc-v7.S:
> > 
> >         .type   __v7_pj4b_proc_info, #object
> > __v7_pj4b_proc_info:
> >         .long   0x560f5800
> >         .long   0xff0fff00
> >         __v7_proc __v7_pj4b_setup, proc_fns = pj4b_processor_functions
> >         .size   __v7_pj4b_proc_info, . - __v7_pj4b_proc_info
> > 
> > How does that compare to the ID you see on Armada 370 and XP?  (I've
> > no idea what IDs you get there...)  If the ID is different, then we
> > should be able to solve this pretty easily for these CPUs by creating
> > a new proc_info record as I think I have previously suggested.
> The ID for Armada 370 is 0x560f5811, but the last digit is the "Product
> Revision Number", so I'm not sure it can be used to distinguish Dove
> from Armada 370. However, I believe we could simply run Dove in a
> Write-Allocate cache policy.
> > With a separate record, we can set the S bit if we need to, and we can
> > also set the write-allocate mode too, now that the patch I sent you
> > is in mainline (which is something I definitely indicated along with
> > that patch.)
> > 
> > I've not really said anything new in this email which I haven't said
> > before, with the exception of stating what the ID is for the Dove SoC
> > I have.  That's the only ID I know, and I assume that those working on
> > mvebu have a better idea what ID numbers are found across all the
> > families.
> Russell, please again what I've asked numerous times. I already wrote
> the patches that creates specific proc_info structures for Armada 370
> and Armada XP to set the cache policy and S bit, as you suggested. But
> this is only setting the PMD flags, but doesn't touch the TTB flags,
> which remain only defined by ALT_UP/ALT_SMP conditional. And I remember
> Catalin saying that having PMD flags varying from the TTB flags could
> be a problem.

That's where having a separate processor struct comes in (which we
already have for the pj4b) and having a separate switch_mm which knows
to set the TTB flags to write allocate.

> Sorry to be a bit harsh here but you seem to assume that I'm an idiot
> who didn't read what you said, and you simply repeat again and again
> that your patch already solves the problem by allowing the proc_info
> structure to define the PMD flags. But you never answered my question
> about the TTB flags, which is the question I've been asking since
> several e-mails already.

You're assuming that I remember all the details each time... I know
that you have sent me several times all the details about what each
Armada device requires, and each time I've briefly read it and thought
"yes, I must get to that" and then never have done.  That's why we're
not making progress - I haven't had the time to read and digest your
messages properly.  For example, if you asked me right now what
Armada XP requires, whether it's SMP or not, I really couldn't tell

> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 6e1b2da..310834e 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -483,14 +483,14 @@ __v7_ca9mp_proc_info:
>  __v7_pj4b_proc_info:
>  	.long	0x560f5810
>  	.long	0xff0ffff0
> -	__v7_proc __v7_pj4b_setup, proc_fns = pj4b_processor_functions
> +	__v7_proc __v7_pj4b_setup, proc_fns = pj4b_processor_functions, mm_mmuflags = PMD_SECT_WBWA
>  	.size	__v7_pj4b_proc_info, . - __v7_pj4b_proc_info
>  	.type   __v7_pj4b_mp_proc_info, #object
>  __v7_pj4b_mp_proc_info:
>  	.long	0x560f5840
>  	.long	0xff0ffff0
> -	__v7_proc __v7_pj4b_setup, proc_fns = pj4b_processor_functions
> +	__v7_proc __v7_pj4b_setup, proc_fns = pj4b_processor_functions, mm_mmuflags = PMD_SECT_WBWA|PMD_SECT_S
>  	.size	__v7_pj4b_mp_proc_info, . - __v7_pj4b_mp_proc_info
>  #endif

Right, so you're using pj4b_processor_functions here.

        globl_equ       cpu_pj4b_switch_mm,     cpu_v7_switch_mm
        globl_equ       cpu_pj4b_set_pte_ext,   cpu_v7_set_pte_ext
        globl_equ       cpu_pj4b_proc_init,     cpu_v7_proc_init

So we need to delete the first (switch_mm) so we can replace it.  I
think we may need a second set of processor functions to deal with
the MP version as well.  Most of these would need to be aliased
to the cpu_pj4b versions, except for the switch_mm one too.

You may wish to consider changing switch_mm to look something like the
following.  I'm assuming that for pj4b you just need WBWA for both
the inner and the outer, but not the shared bit - you'll need to change
that if it needs something else:

	b	1f
	orr	r0, r0, #TTB_FLAGS_SMP
	b	1f
        ALT_SMP(orr     r0, r0, #TTB_FLAGS_SMP)
        ALT_UP(orr      r0, r0, #TTB_FLAGS_UP)
1:      mmid    r1, r1                          @ get mm->context.id
        mov     r2, #0
#ifdef CONFIG_ARM_ERRATA_430973
        mcr     p15, 0, r2, c7, c5, 6           @ flush BTAC/BTB

That gets rid of the one in the switch_mm path.  The setup
function is much more icky.

I think __v7_setup needs splitting up - especially due to all the
errata in there.  I think we ought to add a whole load of processor
proc_info's - for ARM Cortex-A8, Cortex-A9, Cortex-A15 etc.  Each
one calls a separate setup function which runs only the appropriate
errata fixups for that core (which eliminates a load of testing in

This also means we can split the v7_ttb_setup macro and change it
to be per CPU.  (I'm not entirely convinced right now that TTB1
isn't initialised with an undefined value - I can't see where r8
is setup amongst the mess that this has turned into.

I'm not sure what this means for the 3level v7 code - that's something
I've never looked at except a brief glance when it went in.  The
v7_ttb_setup there looks rather complicated and might make the changes
I outlined above much harder.  That also isn't something I can play
around with because I've never been able to get Linux running on the
Versatile Express with the CPU card that would support that code.  We
may need to enlist Catalin's help to modify that code.

That's about the best guidance I can give this evening, and I hope
it helps you.

FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

More information about the linux-arm-kernel mailing list