[RFC PATCH v2 2/2] soc: renesas: Add L2 cache management for RZ/Five SoC

Guo Ren guoren at kernel.org
Tue Oct 11 06:10:29 PDT 2022


On Tue, Oct 11, 2022 at 5:39 PM Lad, Prabhakar
<prabhakar.csengg at gmail.com> wrote:
>
> Hi Guo,
>
> On Thu, Oct 6, 2022 at 1:59 AM Guo Ren <guoren at kernel.org> wrote:
> >
> > On Wed, Oct 5, 2022 at 11:03 PM Lad, Prabhakar
> > <prabhakar.csengg at gmail.com> wrote:
> > >
> > > Hi Guo,
> > >
> > > On Wed, Oct 5, 2022 at 3:23 PM Guo Ren <guoren at kernel.org> wrote:
> > > >
> > > > On Wed, Oct 5, 2022 at 8:54 PM Lad, Prabhakar
> > > > <prabhakar.csengg at gmail.com> wrote:
> > > > >
> > > > > Hi Guo,
> > > > >
> > > > > On Wed, Oct 5, 2022 at 2:29 AM Guo Ren <guoren at kernel.org> wrote:
> > > > > >
> > > > > > On Tue, Oct 4, 2022 at 6:32 AM Prabhakar <prabhakar.csengg at gmail.com> wrote:
> > > > > > >
> > > > > > > From: Lad Prabhakar <prabhakar.mahadev-lad.rj at bp.renesas.com>
> > > > > > >
> > > > > > > On the AX45MP core, cache coherency is a specification option so it may
> > > > > > > not be supported. In this case DMA will fail. As a workaround, firstly we
> > > > > > > allocate a global dma coherent pool from which DMA allocations are taken
> > > > > > > and marked as non-cacheable + bufferable using the PMA region as specified
> > > > > > > in the device tree. Synchronization callbacks are implemented to
> > > > > > > synchronize when doing DMA transactions.
> > > > > > >
> > > > > > > The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > > > > > > block that allows dynamic adjustment of memory attributes at runtime.
> > > > > > > It contains a configurable number of PMA entries implemented as CSR
> > > > > > > registers to control the attributes of memory locations of interest.
> > > > > > >
> > > > > > > Below are the memory attributes supported:
> > > > > > > * Device, Non-bufferable
> > > > > > > * Device, bufferable
> > > > > > > * Memory, Non-cacheable, Non-bufferable
> > > > > > > * Memory, Non-cacheable, Bufferable
> > > > > > > * Memory, Write-back, No-allocate
> > > > > > > * Memory, Write-back, Read-allocate
> > > > > > > * Memory, Write-back, Write-allocate
> > > > > > > * Memory, Write-back, Read and Write-allocate
> > > > > > > It seems Svpbmt's PMA, IO, and NC types wouldn't fit your requirements.
> > > > > > > Could you give a mapping from these attributes to the Svpbmt types, and
> > > > > > > point out what you need that Svpbmt can't provide?
> > > > > >
> > > > > > Sorry, I didn't get what you meant here. Could you please elaborate?
> > > > > I know there is no pbmt in AX45MP; I am just curious how many physical
> > > > > memory attributes you would use in Linux. It seems only one type is used
> > > > > in the series:
> > > > cpu_nocache_area_set -> sbi_ecall(SBI_EXT_ANDES,
> > > > SBI_EXT_ANDES_SET_PMA, offset, vaddr, size, entry_id, 0, 0);
> > > >
> > > > Yes, currently we only use "Memory, Non-cacheable, Bufferable". I was
> > > > wondering if we could pass these options as flags from the DT, something
> > > > like below, so that they are not hard-coded in the code.
> > >
> > > /* PMA config */
> > > #define AX45MP_PMACFG_ETYP                GENMASK(1, 0)
> > > /* OFF: PMA entry is disabled */
> > > #define AX45MP_PMACFG_ETYP_DISABLED            0
> > > /* Naturally aligned power of 2 region */
> > > #define AX45MP_PMACFG_ETYP_NAPOT            3
> > >
> > > #define AX45MP_PMACFG_MTYP                GENMASK(5, 2)
> > > /* Device, Non-bufferable */
> > > #define AX45MP_PMACFG_MTYP_DEV_NON_BUF            (0 << 2)
> > > /* Device, bufferable */
> > > #define AX45MP_PMACFG_MTYP_DEV_BUF            (1 << 2)
> > > /* Memory, Non-cacheable, Non-bufferable */
> > > #define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF    (2 << 2)
> > > /* Memory, Non-cacheable, Bufferable */
> > > #define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF        (3 << 2)
> > > /* Memory, Write-back, No-allocate */
> > > #define AX45MP_PMACFG_MTYP_MEM_WB_NA            (8 << 2)
> > > /* Memory, Write-back, Read-allocate */
> > > #define AX45MP_PMACFG_MTYP_MEM_WB_RA            (9 << 2)
> > > /* Memory, Write-back, Write-allocate */
> > > #define AX45MP_PMACFG_MTYP_MEM_WB_WA            (10 << 2)
> > > /* Memory, Write-back, Read and Write-allocate */
> > > #define AX45MP_PMACFG_MTYP_MEM_WB_R_WA            (11 << 2)
> > >
> > > /* AMO instructions are supported */
> > > #define AX45MP_PMACFG_NAMO_AMO_SUPPORT            (0 << 6)
> > > /* AMO instructions are not supported */
> > > #define AX45MP_PMACFG_NAMO_AMO_NO_SUPPORT        (1 << 6)
> > >
> > >
> > >                 pma-regions = <0x0 0x00000000 0x0 0x10000000 0x0
> > > AX45MP_PMACFG_ETYP_NAPOT |  AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF |
> > > AX45MP_PMACFG_NAMO_AMO_SUPPORT>,
> > >                               <0x0 0x10000000 0x0 0x04000000 0x0
> > > AX45MP_PMACFG_ETYP_NAPOT |  AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF |
> > > AX45MP_PMACFG_NAMO_AMO_SUPPORT >,
> > >                               <0x0 0x20000000 0x0 0x10000000 0x0
> > > AX45MP_PMACFG_ETYP_NAPOT |  AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF |
> > > AX45MP_PMACFG_NAMO_AMO_SUPPORT>,
> > >                               <0x0 0x58000000 0x0 0x08000000 0x0
> > > AX45MP_PMACFG_ETYP_NAPOT |  AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF |
> > > AX45MP_PMACFG_NAMO_AMO_SUPPORT>;
> > >
> > > Does the above sound good?
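As a quick cross-check of the proposal above, the three macros OR-ed together in each pma-regions cell should evaluate to the raw 0xf flag used in the current DT. Below is a minimal sketch of that arithmetic; the field values are copied from the proposed header, while the helper ax45mp_pmacfg() and its range checks are hypothetical and not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Field values copied from the proposed header quoted above. */
#define AX45MP_PMACFG_ETYP_NAPOT              3u
#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF  (3u << 2)
#define AX45MP_PMACFG_NAMO_AMO_SUPPORT        (0u << 6)
#define AX45MP_PMACFG_NAMO_AMO_NO_SUPPORT     (1u << 6)

/*
 * Hypothetical helper (not in the patch series): combine the three
 * pre-shifted fields into one PMA config value.
 */
static inline uint32_t ax45mp_pmacfg(uint32_t etyp, uint32_t mtyp,
				     uint32_t namo)
{
	/* Each field must stay inside its own bit range. */
	assert((etyp & ~0x03u) == 0);	/* ETYP: bits 1:0 */
	assert((mtyp & ~0x3cu) == 0);	/* MTYP: bits 5:2 */
	assert((namo & ~0x40u) == 0);	/* NAMO: bit 6    */

	return etyp | mtyp | namo;
}
```

With NAPOT, "Memory, Non-cacheable, Bufferable" and AMO support, the helper yields 3 | (3 << 2) | 0 = 0xf, so the symbolic form and the existing numeric form describe the same setting.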
> > I've no idea. But as a workaround, I would give an Acked-by.
> >
> > >
> > > > > I'm not sure how you make the emmc/usb/gmac DMA control descriptors
> > > > > work without pbmt when they don't have a cache coherency protocol. Do
> > > > > you need to inject dma_sync for descriptor synchronization? What's the
> > > > > effect of dynamic PMA in the patch series?
> > > >
> > > Currently we have set up the PMA regions as below:
> > >
> > > l2cache: cache-controller at 13400000 {
> > >                 compatible = "andestech,ax45mp-cache", "cache";
> > >                 cache-size = <0x40000>;
> > >                 cache-line-size = <64>;
> > >                 cache-sets = <1024>;
> > >                 cache-unified;
> > >                 reg = <0x0 0x13400000 0x0 0x100000>;
> > >                 pma-regions = <0x0 0x00000000 0x0 0x10000000 0x0 0xf>,
> > >                               <0x0 0x10000000 0x0 0x04000000 0x0 0xf>,
> > >                               <0x0 0x20000000 0x0 0x10000000 0x0 0xf>,
> > >                               <0x0 0x58000000 0x0 0x08000000 0x0 0xf>;
> > >                 interrupts = <SOC_PERIPHERAL_IRQ(476, IRQ_TYPE_LEVEL_HIGH)>;
> > >         };
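Since every entry above uses the NAPOT entry type (naturally aligned power of 2), each region's size should be a power of two and its base a multiple of that size. A minimal sketch of that check follows, with the four base/size pairs taken from the pma-regions property; the function names and the table are hypothetical, and any further hardware constraints (such as a minimum region size) are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when size is a power of two and base is a multiple of size. */
static bool pma_region_is_napot(uint64_t base, uint64_t size)
{
	if (size == 0 || (size & (size - 1)) != 0)
		return false;

	return (base & (size - 1)) == 0;
}

/* Base/size pairs copied from the pma-regions property above. */
static const struct {
	uint64_t base;
	uint64_t size;
} pma_regions[] = {
	{ 0x00000000, 0x10000000 },
	{ 0x10000000, 0x04000000 },
	{ 0x20000000, 0x10000000 },
	{ 0x58000000, 0x08000000 },
};

/* Hypothetical sanity check: every listed region must be NAPOT-compatible. */
static void pma_regions_check(void)
{
	for (size_t i = 0; i < sizeof(pma_regions) / sizeof(pma_regions[0]); i++)
		assert(pma_region_is_napot(pma_regions[i].base,
					   pma_regions[i].size));
}
```

All four listed regions pass this check; a region such as base 0x04000000 with size 0x10000000 would not, because the base is not aligned to the size.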
> > >
> > > The last pma-regions entry, 0x58000000, is a DDR location. This memory
> > > location is marked as a shared DMA pool with the below in the DT:
> > >
> > >     reserved-memory {
> > >         #address-cells = <2>;
> > >         #size-cells = <2>;
> > >         ranges;
> > >
> > >         reserved: linux,cma at 58000000 {
> > >             compatible = "shared-dma-pool";
> > >             no-map;
> > >             linux,dma-default;
> > >             reg = <0x0 0x58000000 0x0 0x08000000>;
> > >         };
> > >     };
> > >
> > > And for ARCH_R9A07G043 we automatically select DMA_GLOBAL_POOL, so the
> > > IP blocks (emmc/usb/gmac) requesting DMA'able memory will
> > > automatically fall into this region, which is non-cacheable but
> > > bufferable (set in PMA), and everything else is taken care of by the
> > > clean and flush callbacks. We don't have to inject dma_sync for
> > > descriptor synchronization in existing drivers (which are shared with
> > > the Renesas RZ/G2L family).
> > Better than I thought :). The "non-cacheable but bufferable" attribute is
> > "weak order," also raising the bufferable signal of AXI transactions. Right?
> Yes, I have confirmed with the HW team that it does raise the bufferable
> signal of AXI transactions. With the drivers (ETH/USB/DMAC) we haven't
> seen any issues so far.
>
> Do you foresee any issues?
That depends on your interconnect design. Most simple interconnects
ignore bufferable. Some NoC interconnects buffer the transactions,
which means data can still be sitting in the interconnect after the
CPU store instruction has retired. If the CPU then kicks off the DMA
with an IO reg write, the hw may not guarantee the ordering between
the last data write and the IO reg write that starts the DMA. The DMA
may then miss the data.

This applies not only to the interconnect: "noncacheable + weak order"
can also leave data in the CPU store buffer after the store
instruction has retired.
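
The hazard described above can be illustrated with a minimal user-space sketch of the usual pattern: write the descriptor, issue a barrier, then write the doorbell register. Everything here is hypothetical (struct dma_desc, struct fake_dev, dma_kick()), and atomic_thread_fence() is only a portable stand-in for a real barrier. In a driver the ordering would normally come from writel(), which on RISC-V Linux is, as far as I know, preceded by a "fence w,o". Whether such a fence also drains writes buffered in a particular interconnect is a hardware property that this sketch cannot show.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor layout; not taken from any real driver. */
struct dma_desc {
	uint32_t addr;
	uint32_t len;
};

/* Hypothetical device model. */
struct fake_dev {
	volatile uint32_t doorbell;	/* stands in for the MMIO kick register */
	struct dma_desc *ring;		/* stands in for the non-cacheable,
					 * bufferable DMA pool */
};

static void dma_kick(struct fake_dev *dev, uint32_t idx,
		     uint32_t addr, uint32_t len)
{
	/*
	 * 1. Plain stores to the descriptor. With a weakly ordered,
	 *    bufferable mapping these may linger in the CPU store buffer
	 *    or in the interconnect after the instructions retire.
	 */
	dev->ring[idx].addr = addr;
	dev->ring[idx].len = len;

	/* 2. Order the descriptor stores before the doorbell write. */
	atomic_thread_fence(memory_order_seq_cst);

	/* 3. Only now tell the device to fetch descriptor idx. */
	dev->doorbell = idx + 1;
}
```

Without step 2, nothing in the program prevents the doorbell write from becoming visible to the device before the descriptor contents, which is the case where the DMA may miss the data.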

>
> Cheers,
> Prabhakar



-- 
Best Regards
 Guo Ren



More information about the linux-riscv mailing list