[PATCH v3] riscv: mm: execute local TLB flush after populating vmemmap

Alexandre Ghiti alex at ghiti.fr
Tue Dec 12 00:04:25 PST 2023


On 12/12/2023 06:42, Vincent Chen wrote:
>
>
> On Mon, Dec 11, 2023 at 4:57 PM Alexandre Ghiti <alex at ghiti.fr> wrote:
>
>     Hi Vincent,
>
>     On 04/05/2023 11:13, Vincent Chen wrote:
>     > On Tue, May 2, 2023 at 8:16 PM Alexandre Ghiti
>     <alexghiti at rivosinc.com> wrote:
>     >> On Tue, May 2, 2023 at 2:18 AM Palmer Dabbelt
>     <palmer at dabbelt.com> wrote:
>     >>> On Sun, 16 Apr 2023 23:06:18 PDT (-0700),
>     vincent.chen at sifive.com wrote:
>     >>>> The spare_init() calls memmap_populate() many times to create
>     VA to PA
>     >>>> mapping for the VMEMMAP area, where all "struct page" are
>     located once
>     >>>> CONFIG_SPARSEMEM_VMEMMAP is defined. These "struct page" are
>     later
>     >>>> initialized in the zone_sizes_init() function. However,
>     during this
>     >>>> process, no sfence.vma instruction is executed for this
>     VMEMMAP area.
>     >>>> This omission may cause the hart to fail to perform page
>     table walk
>     >>>> because some data related to the address translation is
>     invisible to the
>     >>>> hart. To solve this issue, the local_flush_tlb_kernel_range()
>     is called
>     >>>> right after the spare_init() to execute a sfence.vma
>     instruction for the
>     >>>> VMEMMAP area, ensuring that all data related to the address
>     translation
>     >>>> is visible to the hart.
>     >>>>
>     >>>> Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
>     >>>> Signed-off-by: Vincent Chen <vincent.chen at sifive.com>
>     >>>> Reviewed-by: Alexandre Ghiti <alexghiti at rivosinc.com>
>     >>>> Reviewed-by: Andrew Jones <ajones at ventanamicro.com>
>     >>>> ---
>     >>>>   arch/riscv/include/asm/tlbflush.h | 7 +++++++
>     >>>>   arch/riscv/mm/init.c              | 5 +++++
>     >>>>   2 files changed, 12 insertions(+)
>     >>>>
>     >>>> diff --git a/arch/riscv/include/asm/tlbflush.h
>     b/arch/riscv/include/asm/tlbflush.h
>     >>>> index a09196f8de68..f9d3712bd93b 100644
>     >>>> --- a/arch/riscv/include/asm/tlbflush.h
>     >>>> +++ b/arch/riscv/include/asm/tlbflush.h
>     >>>> @@ -61,4 +61,11 @@ static inline void
>     flush_tlb_kernel_range(unsigned long start,
>     >>>>        flush_tlb_all();
>     >>>>   }
>     >>>>
>     >>>> +/* Flush a range of kernel pages without broadcasting */
>     >>>> +static inline void local_flush_tlb_kernel_range(unsigned
>     long start,
>     >>>> +    unsigned long end)
>     >>>> +{
>     >>>> +     local_flush_tlb_all();
>     >>> Why is this a global flush, istead of a range-based one?  We
>     should
>     >>> probably make the same change for flush_tlb_kernel_range(), though
>     >>> that's not a fix.
>     >> You're right, but the current implementation of flush_tlb_range
>     will
>     >> ultimately call flush_tlb_all if the size > PAGE_SIZE, which is
>     very
>     >> likely in this case. Anyway, I have a patchset almost ready that
>     >> improves the tlb flush routines, and I noted that I had to fix this
>     >> function once merged.
>     >>
>     > I agree with Alexandre that it's highly probable that the size
>     will be
>     > greater than PAGE_SIZE in this case. However, I must admit that this
>     > implementation is not suitable for the case where the size is less
>     > than or equal to PAGE_SIZE. I am willing to wait for Alexandre's
>     patch
>     > and rebase my patch on top of his If you believe it is a better
>     > approach.
>
>
>     My patchset was merged in 6.6, so you can now rebase this :) Let
>     me know
>     if I can help.
>
>     Thanks,
>
>     Alex
>
> Thank you for the notification. I will rebase my patch on the Linux 6.6.


Sorry, it actually landed in 6.7, not 6.6.

Thanks!

Alex


>
> Thanks,
> Vincent
>
>
>     >
>     >>>> +}
>     >>>> +
>     >>>>   #endif /* _ASM_RISCV_TLBFLUSH_H */
>     >>>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>     >>>> index 0f14f4a8d179..bcf365cbbcc1 100644
>     >>>> --- a/arch/riscv/mm/init.c
>     >>>> +++ b/arch/riscv/mm/init.c
>     >>>> @@ -1221,6 +1221,10 @@ void __init misc_mem_init(void)
>     >>>>        early_memtest(min_low_pfn << PAGE_SHIFT, max_low_pfn
>     << PAGE_SHIFT);
>     >>>>        arch_numa_init();
>     >>>>        sparse_init();
>     >>>> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
>     >>>> +     /* The entire VMEMMAP region has been populated. Flush
>     TLB for this region */
>     >>>> +  local_flush_tlb_kernel_range(VMEMMAP_START, VMEMMAP_END);
>     >>>> +#endif
>     >>>>        zone_sizes_init();
>     >>>>        reserve_crashkernel();
>     >>>>        memblock_dump_all();
>     >>>> @@ -1230,6 +1234,7 @@ void __init misc_mem_init(void)
>     >>>>   int __meminit vmemmap_populate(unsigned long start,
>     unsigned long end, int node,
>     >>>>                               struct vmem_altmap *altmap)
>     >>>>   {
>     >>>> +     /* Defer the required TLB flush until the entire
>     VMEMMAP region has been populated */
>     >>>>        return vmemmap_populate_basepages(start, end, node, NULL);
>     >>>>   }
>     >>>>   #endif
>     > _______________________________________________
>     > linux-riscv mailing list
>     > linux-riscv at lists.infradead.org
>     > http://lists.infradead.org/mailman/listinfo/linux-riscv
>



More information about the linux-riscv mailing list