[PATCH 2/3] ARM: mm: add support for HW coherent systems in PL310
Catalin Marinas
catalin.marinas at arm.com
Wed Apr 9 04:15:15 PDT 2014
On Tue, Apr 08, 2014 at 07:12:12PM +0100, Thomas Petazzoni wrote:
> On Tue, 8 Apr 2014 18:24:25 +0100, Catalin Marinas wrote:
> > > of_init = true;
> > > memcpy(&outer_cache, &data->outer_cache, sizeof(outer_cache));
> > > +
> > > + /*
> > > + * PL310 doesn't need an outer cache sync operation when the
> > > + * system is operating with hardware coherency enabled, as it
> > > + * is done directly in hardware.
> > > + */
> > > + if (of_device_is_compatible(np, "arm,pl310-cache") && is_coherent)
> > > + outer_cache.sync = NULL;
> > > +
> >
> > For this particular case, you can add a specific l2x0_of_data structure
> > with the right compatible string for your platform where
> > outer_cache.sync is NULL,
>
> In fact, I'm not sure using a separate compatible string is possible,
> because there are situations where the hardware platform may be I/O
> coherent, and some situations where it is not the case. For example, in
> the current kernel, the platform is I/O coherent when CONFIG_SMP is
> enabled, but not I/O coherent when CONFIG_SMP is disabled. And it's the
> same hardware platform, so same Device Tree in both cases.
>
> So using a different compatible string doesn't work here: we would have
> to adjust the compatible string depending on whether CONFIG_SMP
> or !CONFIG_SMP is used. Of course, this is doable at run time before
> probing the L2 cache driver, using a small quirk, but that doesn't
> sound really nice.
You could use #ifdef CONFIG_SMP around the compatible string but I
agree, it doesn't look nice.
> > assuming that dma_alloc_coherent() returns
> > Cacheable memory (e.g. your platform uses arm_coherent_dma_ops).
>
> We don't use arm_coherent_dma_ops, we have our own DMA operations, see
> arch/arm/mach-mvebu/coherency.c (in mainline, only the support for the
> PJ4B based Armada 370/XP is there, but the support for the Cortex-A9
> based Armada 375/38x has been posted at
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-April/244916.html).
Slightly confusing. So for dma_alloc you can use Cacheable memory, but
for the from_device cases, you need some mvebu_hwcc_sync_io_barrier()
(and no cache maintenance). Doesn't it mean that barriers like rmb()
need such barrier as well (in combination with dma_alloc'ed buffers)?
> > > l2x0_init(l2x0_base, aux_val, aux_mask);
> > >
> > > return 0;
> > > }
> > > +
> > > +int __init l2x0_of_init(u32 aux_val, u32 aux_mask)
> > > +{
> > > + return l2x0_of_init_common(aux_val, aux_mask, false);
> > > +}
> > > +
> > > +int __init l2x0_of_init_coherent(u32 aux_val, u32 aux_mask)
> > > +{
> > > + return l2x0_of_init_common(aux_val, aux_mask, true);
> > > +}
> >
> > This could be a bit misleading. I would rather have a generic
> > pl310_data_dma_coherent structure (though even on coherent systems you
> > could still have drivers that prefer writecombine/NormalNC memory to
> > avoid thrashing the cache).
>
> I'm not sure to follow your suggestion here. What structure type would
> pl310_data_dma_coherent be? A "struct l2x0_of_data" like this:
>
> static const struct l2x0_of_data pl310_dma_coherent_data = {
> .setup = pl310_of_setup,
> .save = pl310_save,
> .outer_cache = {
> .resume = pl310_resume,
> .inv_range = l2x0_inv_range,
> .clean_range = l2x0_clean_range,
> .flush_range = l2x0_flush_range,
> .sync = NULL,
> .flush_all = l2x0_flush_all,
> .inv_all = l2x0_inv_all,
> .disable = l2x0_disable,
> },
> };
>
> static const struct of_device_id l2x0_ids[] __initconst = {
> [...]
> { .compatible = "arm,pl310-cache", .data = (void *)&pl310_data },
> { .compatible = "arm,pl310-cache-dma-coherent", .data = (void *)&pl310_dma_coherent_data },
> [...]
> };
>
> Is this what you have in mind? This looks ok to me, if we have a
> solution for the CONFIG_SMP vs. !CONFIG_SMP problem.
I don't have a nice way. Basically for this board, even if the coherency
was fully set up by firmware (which BTW should be the case for arm64, we
don't take mach-* directories ;)), it depends on the kernel build, I
guess this is because of the Shareable vs NonShareable mappings. The
best I can come up with is a conditionally compiled "marvell,..."
compatible string but your original approach is probably nicer (and as I
said, with the assumption that no driver uses writecombine/dmacoherent
memory for DMA).
--
Catalin
More information about the linux-arm-kernel
mailing list