[PATCH v3] riscv: mm: execute local TLB flush after populating vmemmap
Alexandre Ghiti
alex at ghiti.fr
Tue Dec 12 00:04:25 PST 2023
On 12/12/2023 06:42, Vincent Chen wrote:
>
>
> On Mon, Dec 11, 2023 at 4:57 PM Alexandre Ghiti <alex at ghiti.fr> wrote:
>
> Hi Vincent,
>
> On 04/05/2023 11:13, Vincent Chen wrote:
> > On Tue, May 2, 2023 at 8:16 PM Alexandre Ghiti
> <alexghiti at rivosinc.com> wrote:
> >> On Tue, May 2, 2023 at 2:18 AM Palmer Dabbelt
> <palmer at dabbelt.com> wrote:
> >>> On Sun, 16 Apr 2023 23:06:18 PDT (-0700),
> vincent.chen at sifive.com wrote:
> >>>> The spare_init() calls memmap_populate() many times to create
> VA to PA
> >>>> mapping for the VMEMMAP area, where all "struct page" are
> located once
> >>>> CONFIG_SPARSEMEM_VMEMMAP is defined. These "struct page" are
> later
> >>>> initialized in the zone_sizes_init() function. However,
> during this
> >>>> process, no sfence.vma instruction is executed for this
> VMEMMAP area.
> >>>> This omission may cause the hart to fail to perform page
> table walk
> >>>> because some data related to the address translation is
> invisible to the
> >>>> hart. To solve this issue, the local_flush_tlb_kernel_range()
> is called
> >>>> right after the spare_init() to execute a sfence.vma
> instruction for the
> >>>> VMEMMAP area, ensuring that all data related to the address
> translation
> >>>> is visible to the hart.
> >>>>
> >>>> Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
> >>>> Signed-off-by: Vincent Chen <vincent.chen at sifive.com>
> >>>> Reviewed-by: Alexandre Ghiti <alexghiti at rivosinc.com>
> >>>> Reviewed-by: Andrew Jones <ajones at ventanamicro.com>
> >>>> ---
> >>>> arch/riscv/include/asm/tlbflush.h | 7 +++++++
> >>>> arch/riscv/mm/init.c | 5 +++++
> >>>> 2 files changed, 12 insertions(+)
> >>>>
> >>>> diff --git a/arch/riscv/include/asm/tlbflush.h
> b/arch/riscv/include/asm/tlbflush.h
> >>>> index a09196f8de68..f9d3712bd93b 100644
> >>>> --- a/arch/riscv/include/asm/tlbflush.h
> >>>> +++ b/arch/riscv/include/asm/tlbflush.h
> >>>> @@ -61,4 +61,11 @@ static inline void
> flush_tlb_kernel_range(unsigned long start,
> >>>> flush_tlb_all();
> >>>> }
> >>>>
> >>>> +/* Flush a range of kernel pages without broadcasting */
> >>>> +static inline void local_flush_tlb_kernel_range(unsigned
> long start,
> >>>> + unsigned long end)
> >>>> +{
> >>>> + local_flush_tlb_all();
> >>> Why is this a global flush, istead of a range-based one? We
> should
> >>> probably make the same change for flush_tlb_kernel_range(), though
> >>> that's not a fix.
> >> You're right, but the current implementation of flush_tlb_range
> will
> >> ultimately call flush_tlb_all if the size > PAGE_SIZE, which is
> very
> >> likely in this case. Anyway, I have a patchset almost ready that
> >> improves the tlb flush routines, and I noted that I had to fix this
> >> function once merged.
> >>
> > I agree with Alexandre that it's highly probable that the size
> will be
> > greater than PAGE_SIZE in this case. However, I must admit that this
> > implementation is not suitable for the case where the size is less
> > than or equal to PAGE_SIZE. I am willing to wait for Alexandre's
> patch
> > and rebase my patch on top of his If you believe it is a better
> > approach.
>
>
> My patchset was merged in 6.6, so you can now rebase this :) Let
> me know
> if I can help.
>
> Thanks,
>
> Alex
>
> Thank you for the notification. I will rebase my patch on the Linux 6.6.
Sorry, it actually landed in 6.7, not 6.6.
Thanks!
Alex
>
> Thanks,
> Vincent
>
>
> >
> >>>> +}
> >>>> +
> >>>> #endif /* _ASM_RISCV_TLBFLUSH_H */
> >>>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> >>>> index 0f14f4a8d179..bcf365cbbcc1 100644
> >>>> --- a/arch/riscv/mm/init.c
> >>>> +++ b/arch/riscv/mm/init.c
> >>>> @@ -1221,6 +1221,10 @@ void __init misc_mem_init(void)
> >>>> early_memtest(min_low_pfn << PAGE_SHIFT, max_low_pfn
> << PAGE_SHIFT);
> >>>> arch_numa_init();
> >>>> sparse_init();
> >>>> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> >>>> + /* The entire VMEMMAP region has been populated. Flush
> TLB for this region */
> >>>> + local_flush_tlb_kernel_range(VMEMMAP_START, VMEMMAP_END);
> >>>> +#endif
> >>>> zone_sizes_init();
> >>>> reserve_crashkernel();
> >>>> memblock_dump_all();
> >>>> @@ -1230,6 +1234,7 @@ void __init misc_mem_init(void)
> >>>> int __meminit vmemmap_populate(unsigned long start,
> unsigned long end, int node,
> >>>> struct vmem_altmap *altmap)
> >>>> {
> >>>> + /* Defer the required TLB flush until the entire
> VMEMMAP region has been populated */
> >>>> return vmemmap_populate_basepages(start, end, node, NULL);
> >>>> }
> >>>> #endif
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
>
More information about the linux-riscv
mailing list