[PATCH] riscv: Add RISC-V svpbmt extension

Guo Ren guoren at kernel.org
Thu Sep 23 04:53:11 PDT 2021


On Thu, Sep 23, 2021 at 5:13 PM Anup Patel <anup at brainfault.org> wrote:
>
> On Thu, Sep 23, 2021 at 12:57 PM <guoren at kernel.org> wrote:
> >
> > From: Guo Ren <guoren at linux.alibaba.com>
> >
> > This patch follows the standard pure RISC-V Svpbmt extension in
> > privilege spec to solve the non-coherent SOC dma synchronization
> > issues.
> >
> > Here is the svpbmt PTE format:
> > | 63 | 62-61 | 60-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> >   N     MT     RSW    D   A   G   U   X   W   R   V
> >         ^
> >
> > Of the Reserved bits [63:54] in a leaf PTE, the high bit is already
> > allocated (as the N bit), so bits [62:61] are used as the MT (aka
> > MemType) field. This field specifies one of three memory types that
> > are close equivalents (or equivalent in effect) to the three main x86
> > and ARMv8 memory types - as shown in the following table.
> >
> > RISC-V
> > Encoding &
> > MemType     RISC-V Description
> > ----------  ------------------------------------------------
> > 00 - PMA    Normal Cacheable, No change to implied PMA memory type
> > 01 - NC     Non-cacheable, idempotent, weakly-ordered Main Memory
> > 10 - IO     Non-cacheable, non-idempotent, strongly-ordered I/O memory
> > 11 - Rsvd   Reserved for future standard use
> >
> > The standard protection_map [] needn't be modified because the "PMA"
> > type keeps the highest bits zero. The standard protection_map[] needn't
> > be modified. So the whole modification is in the arch/riscv pgtable and
> > using a global variable (__riscv_svpbmt) in the _PAGE_DMA_MASK/IO/NC
> > for pgprot_noncached (&writecombine). We also need _PAGE_CHG_MASK to
> > detect PFN than before.
> >
> > Devicetree: reuse "mmu-type" of cpu_section as a user interface to
> > enable the feature or not:
> >  - riscv,sv39,svpbmt
> >  - riscv,sv48,svpbmt
> >
> > Signed-off-by: Guo Ren <guoren at linux.alibaba.com>
> > Cc: Liu Shaohua <liush at allwinnertech.com>
> > Cc: Wei Fu <wefu at redhat.com>
> > Cc: Palmer Dabbelt <palmerdabbelt at google.com>
> > Cc: Christoph Hellwig <hch at lst.de>
> > Cc: Anup Patel <anup.patel at wdc.com>
> > Cc: Arnd Bergmann <arnd at arndb.de>
> > Cc: Atish Patra <atish.patra at wdc.com>
> > Cc: Drew Fustini <drew at beagleboard.org>
> > Cc: Wei Fu <wefu at redhat.com>
> > Cc: Wei Wu <lazyparser at gmail.com>
> > Cc: Chen-Yu Tsai <wens at csie.org>
> > Cc: Maxime Ripard <maxime at cerno.tech>
> > Cc: Daniel Lustig <dlustig at nvidia.com>
> > Cc: Greg Favor <gfavor at ventanamicro.com>
> > Cc: Andrea Mondelli <andrea.mondelli at huawei.com>
> > Cc: Jonathan Behrens <behrensj at mit.edu>
> > Cc: Xinhaoqu (Freddie) <xinhaoqu at huawei.com>
> > Cc: Bill Huffman <huffman at cadence.com>
> > Cc: Nick Kossifidis <mick at ics.forth.gr>
> > Cc: Allen Baum <allen.baum at esperantotech.com>
> > Cc: Josh Scheid <jscheid at ventanamicro.com>
> > Cc: Richard Trauben <rtrauben at gmail.com>
> > ---
> >  .../devicetree/bindings/riscv/cpus.yaml       |  2 +
> >  arch/riscv/include/asm/fixmap.h               |  2 +-
> >  arch/riscv/include/asm/pgtable-64.h           |  8 ++--
> >  arch/riscv/include/asm/pgtable-bits.h         | 46 ++++++++++++++++++-
> >  arch/riscv/include/asm/pgtable.h              | 39 ++++++++++++----
> >  arch/riscv/include/asm/processor.h            |  1 +
> >  arch/riscv/kernel/cpufeature.c                | 23 ++++++++++
> >  arch/riscv/kernel/setup.c                     |  2 +
> >  arch/riscv/mm/init.c                          |  5 ++
> >  9 files changed, 113 insertions(+), 15 deletions(-)
> >
> > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > index e534f6a7cfa1..1825cd8db0de 100644
> > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > @@ -56,7 +56,9 @@ properties:
> >      enum:
> >        - riscv,sv32
> >        - riscv,sv39
> > +      - riscv,sv39,svpbmt
> >        - riscv,sv48
> > +      - riscv,sv48,svpbmt
> >        - riscv,none
>
> I think augmenting the "mmu-type" DT property is a good approach but the
> description of "mmu-type" DT property needs to say:
>
> "Identifies the MMU address translation mode and page based memory
> type used on ..."
Okay

>
> Also, DT bindings change is generally a separate patch so better to make
> a separate patch for this.
Okay

>
> >
> >    riscv,isa:
> > diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
> > index 54cbf07fb4e9..5acd99d08e74 100644
> > --- a/arch/riscv/include/asm/fixmap.h
> > +++ b/arch/riscv/include/asm/fixmap.h
> > @@ -43,7 +43,7 @@ enum fixed_addresses {
> >         __end_of_fixed_addresses
> >  };
> >
> > -#define FIXMAP_PAGE_IO         PAGE_KERNEL
> > +#define FIXMAP_PAGE_IO         PAGE_IOREMAP
> >
> >  #define __early_set_fixmap     __set_fixmap
> >
> > diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
> > index 228261aa9628..0b53ea67e91a 100644
> > --- a/arch/riscv/include/asm/pgtable-64.h
> > +++ b/arch/riscv/include/asm/pgtable-64.h
> > @@ -61,12 +61,14 @@ static inline void pud_clear(pud_t *pudp)
> >
> >  static inline pmd_t *pud_pgtable(pud_t pud)
> >  {
> > -       return (pmd_t *)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT);
> > +       return (pmd_t *)pfn_to_virt((pud_val(pud) & _PAGE_CHG_MASK)
> > +                                               >> _PAGE_PFN_SHIFT);
> >  }
> >
> >  static inline struct page *pud_page(pud_t pud)
> >  {
> > -       return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT);
> > +       return pfn_to_page((pud_val(pud) & _PAGE_CHG_MASK)
> > +                                               >> _PAGE_PFN_SHIFT);
> >  }
> >
> >  static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
> > @@ -76,7 +78,7 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
> >
> >  static inline unsigned long _pmd_pfn(pmd_t pmd)
> >  {
> > -       return pmd_val(pmd) >> _PAGE_PFN_SHIFT;
> > +       return (pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT;
> >  }
> >
> >  #define mk_pmd(page, prot)    pfn_pmd(page_to_pfn(page), prot)
> > diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
> > index 2ee413912926..041fe4fdbafa 100644
> > --- a/arch/riscv/include/asm/pgtable-bits.h
> > +++ b/arch/riscv/include/asm/pgtable-bits.h
> > @@ -7,7 +7,7 @@
> >  #define _ASM_RISCV_PGTABLE_BITS_H
> >
> >  /*
> > - * PTE format:
> > + * rv32 PTE format:
> >   * | XLEN-1  10 | 9             8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> >   *       PFN      reserved for SW   D   A   G   U   X   W   R   V
> >   */
> > @@ -24,6 +24,47 @@
> >  #define _PAGE_DIRTY     (1 << 7)    /* Set by hardware on any write */
> >  #define _PAGE_SOFT      (1 << 8)    /* Reserved for software */
> >
> > +#ifndef __ASSEMBLY__
> > +#ifdef CONFIG_64BIT
> > +/*
> > + * rv64 PTE format:
> > + * | 63 | 62 61 | 60 54 | 53  10 | 9             8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> > + *   N      MT     RSV    PFN      reserved for SW   D   A   G   U   X   W   R   V
> > + * [62:61] Memory Type definitions:
> > + *  00 - PMA    Normal Cacheable, No change to implied PMA memory type
> > + *  01 - NC     Non-cacheable, idempotent, weakly-ordered Main Memory
> > + *  10 - IO     Non-cacheable, non-idempotent, strongly-ordered I/O memory
> > + *  11 - Rsvd   Reserved for future standard use
> > + */
> > +#define _PAGE_MT_MASK          ((u64)0x3 << 61)
> > +#define _PAGE_MT_PMA           ((u64)0x0 << 61)
> > +#define _PAGE_MT_NC            ((u64)0x1 << 61)
> > +#define _PAGE_MT_IO            ((u64)0x2 << 61)
> > +
> > +enum {
> > +       MT_PMA,
> > +       MT_NC,
> > +       MT_IO,
> > +       MT_MAX
> > +};
> > +
> > +extern struct __riscv_svpbmt_struct {
> > +       unsigned long mask;
> > +       unsigned long mt[MT_MAX];
> > +} __riscv_svpbmt;
> > +
> > +#define _PAGE_DMA_MASK         __riscv_svpbmt.mask
> > +#define _PAGE_DMA_PMA          __riscv_svpbmt.mt[MT_PMA]
> > +#define _PAGE_DMA_NC           __riscv_svpbmt.mt[MT_NC]
> > +#define _PAGE_DMA_IO           __riscv_svpbmt.mt[MT_IO]
> > +#else
> > +#define _PAGE_DMA_MASK         0
> > +#define _PAGE_DMA_PMA          0
> > +#define _PAGE_DMA_NC           0
> > +#define _PAGE_DMA_IO           0
> > +#endif /* CONFIG_64BIT */
> > +#endif /* __ASSEMBLY__ */
> > +
> >  #define _PAGE_SPECIAL   _PAGE_SOFT
> >  #define _PAGE_TABLE     _PAGE_PRESENT
> >
> > @@ -38,7 +79,8 @@
> >  /* Set of bits to preserve across pte_modify() */
> >  #define _PAGE_CHG_MASK  (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \
> >                                           _PAGE_WRITE | _PAGE_EXEC |    \
> > -                                         _PAGE_USER | _PAGE_GLOBAL))
> > +                                         _PAGE_USER | _PAGE_GLOBAL |   \
> > +                                         _PAGE_DMA_MASK))
> >  /*
> >   * when all of R/W/X are zero, the PTE is a pointer to the next level
> >   * of the page table; otherwise, it is a leaf PTE.
> > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> > index 39b550310ec6..d07ba586c866 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -136,7 +136,8 @@
> >                                 | _PAGE_PRESENT \
> >                                 | _PAGE_ACCESSED \
> >                                 | _PAGE_DIRTY \
> > -                               | _PAGE_GLOBAL)
> > +                               | _PAGE_GLOBAL \
> > +                               | _PAGE_DMA_PMA)
> >
> >  #define PAGE_KERNEL            __pgprot(_PAGE_KERNEL)
> >  #define PAGE_KERNEL_READ       __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE)
> > @@ -146,11 +147,9 @@
> >
> >  #define PAGE_TABLE             __pgprot(_PAGE_TABLE)
> >
> > -/*
> > - * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
> > - * change the properties of memory regions.
> > - */
> > -#define _PAGE_IOREMAP _PAGE_KERNEL
> > +#define _PAGE_IOREMAP  ((_PAGE_KERNEL & ~_PAGE_DMA_MASK) | _PAGE_DMA_IO)
> > +
> > +#define PAGE_IOREMAP           __pgprot(_PAGE_IOREMAP)
> >
> >  extern pgd_t swapper_pg_dir[];
> >
> > @@ -230,12 +229,12 @@ static inline unsigned long _pgd_pfn(pgd_t pgd)
> >
> >  static inline struct page *pmd_page(pmd_t pmd)
> >  {
> > -       return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
> > +       return pfn_to_page((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> >  }
> >
> >  static inline unsigned long pmd_page_vaddr(pmd_t pmd)
> >  {
> > -       return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
> > +       return (unsigned long)pfn_to_virt((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> >  }
> >
> >  static inline pte_t pmd_pte(pmd_t pmd)
> > @@ -251,7 +250,7 @@ static inline pte_t pud_pte(pud_t pud)
> >  /* Yields the page frame number (PFN) of a page table entry */
> >  static inline unsigned long pte_pfn(pte_t pte)
> >  {
> > -       return (pte_val(pte) >> _PAGE_PFN_SHIFT);
> > +       return ((pte_val(pte) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> >  }
> >
> >  #define pte_page(x)     pfn_to_page(pte_pfn(x))
> > @@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
> >         return ptep_test_and_clear_young(vma, address, ptep);
> >  }
> >
> > +#define pgprot_noncached pgprot_noncached
> > +static inline pgprot_t pgprot_noncached(pgprot_t _prot)
> > +{
> > +       unsigned long prot = pgprot_val(_prot);
> > +
> > +       prot &= ~_PAGE_DMA_MASK;
> > +       prot |= _PAGE_DMA_IO;
> > +
> > +       return __pgprot(prot);
> > +}
> > +
> > +#define pgprot_writecombine pgprot_writecombine
> > +static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
> > +{
> > +       unsigned long prot = pgprot_val(_prot);
> > +
> > +       prot &= ~_PAGE_DMA_MASK;
> > +       prot |= _PAGE_DMA_NC;
> > +
> > +       return __pgprot(prot);
> > +}
> > +
> >  /*
> >   * THP functions
> >   */
> > diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
> > index 021ed64ee608..92676156cbf6 100644
> > --- a/arch/riscv/include/asm/processor.h
> > +++ b/arch/riscv/include/asm/processor.h
> > @@ -70,6 +70,7 @@ struct device_node;
> >  int riscv_of_processor_hartid(struct device_node *node);
> >  int riscv_of_parent_hartid(struct device_node *node);
> >
> > +extern void riscv_svpbmt(void);
>
> You can drop this change.
>
> >  extern void riscv_fill_hwcap(void);
> >  extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
> >
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index d959d207a40d..4a2211c2c464 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -8,6 +8,7 @@
> >
> >  #include <linux/bitmap.h>
> >  #include <linux/of.h>
> > +#include <linux/pgtable.h>
> >  #include <asm/processor.h>
> >  #include <asm/hwcap.h>
> >  #include <asm/smp.h>
> > @@ -59,6 +60,28 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit)
> >  }
> >  EXPORT_SYMBOL_GPL(__riscv_isa_extension_available);
> >
> > +void __init riscv_svpbmt(void)
>
> Make this static function and call it from riscv_fill_hwcap().
Okay

>
> > +{
> > +#ifdef CONFIG_64BIT
> > +       struct device_node *node;
> > +       const char *str;
> > +
> > +       for_each_of_cpu_node(node) {
> > +               if (of_property_read_string(node, "mmu-type", &str)) {
> > +                       continue;
> > +               }
> > +
> > +               if (!strncmp(str + 11, "svpbmt", 6)) {
> > +                       __riscv_svpbmt.mask       = _PAGE_MT_MASK;
> > +                       __riscv_svpbmt.mt[MT_PMA] = _PAGE_MT_PMA;
> > +                       __riscv_svpbmt.mt[MT_NC]  = _PAGE_MT_NC;
> > +                       __riscv_svpbmt.mt[MT_IO]  = _PAGE_MT_IO;
> > +                       break;
> > +               }
>
> I don't agree with this loop.
>
> The __riscv_svpbmt should be updated only when all CPU nodes
> have "svpmbt" in the "mmu-type" DT property.
Okay

>
> > +       }
> > +#endif
> > +}
> > +
> >  void __init riscv_fill_hwcap(void)
> >  {
> >         struct device_node *node;
> > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> > index 120b2f6f71bc..e7457113b45e 100644
> > --- a/arch/riscv/kernel/setup.c
> > +++ b/arch/riscv/kernel/setup.c
> > @@ -294,6 +294,8 @@ void __init setup_arch(char **cmdline_p)
> >         setup_smp();
> >  #endif
> >
> > +       riscv_svpbmt();
> > +
>
> You can drop this change based on above comment.
>
> >         riscv_fill_hwcap();
> >  }
> >
> > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> > index 7cb4f391d106..43b2e11fd3e0 100644
> > --- a/arch/riscv/mm/init.c
> > +++ b/arch/riscv/mm/init.c
> > @@ -905,3 +905,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
> >         return vmemmap_populate_basepages(start, end, node, NULL);
> >  }
> >  #endif
> > +
> > +#ifdef CONFIG_64BIT
> > +struct __riscv_svpbmt_struct __riscv_svpbmt __ro_after_init;
> > +EXPORT_SYMBOL(__riscv_svpbmt);
> > +#endif
> > --
> > 2.25.1
> >
>
> Appending "svpmb" suffix to "mmu-type" breaks print_mmu()
> in arch/riscv/kernel/cpu.c. Please update that as well.
Okay

>
> We will also need couple of Tested-by for existing boards.
>
> Regards,
> Anup

-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/



More information about the linux-riscv mailing list