[PATCH v4 2/2] arm64: support batched/deferred tlb shootdown during page reclamation

Barry Song 21cnbao at gmail.com
Thu Oct 27 15:07:58 PDT 2022


On Thu, Oct 27, 2022 at 11:42 PM Anshuman Khandual
<anshuman.khandual at arm.com> wrote:
>
>
>
> On 9/28/22 05:53, Barry Song wrote:
> > On Tue, Sep 27, 2022 at 10:15 PM Yicong Yang <yangyicong at huawei.com> wrote:
> >>
> >> On 2022/9/27 14:16, Anshuman Khandual wrote:
> >>> [...]
> >>>
> >>> On 9/21/22 14:13, Yicong Yang wrote:
> >>>> +static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
> >>>> +{
> >>>> +    /* for small systems with small number of CPUs, TLB shootdown is cheap */
> >>>> +    if (num_online_cpus() <= 4)
> >>>
> >>> It would be great to have some more inputs from others, whether 4 (which should
> >>> to be codified into a macro e.g ARM64_NR_CPU_DEFERRED_TLB, or something similar)
> >>> is optimal for an wide range of arm64 platforms.
> >>>
> >
> > I have tested it on a 4-cpus and 8-cpus machine. but i have no machine
> > with 5,6,7
> > cores.
> > I saw improvement on 8-cpus machines and I found 4-cpus machines don't need
> > this patch.
> >
> > so it seems safe to have
> > if (num_online_cpus()  < 8)
> >
> >>
> >> Do you prefer this macro to be static or make it configurable through kconfig then
> >> different platforms can make choice based on their own situations? It maybe hard to
> >> test on all the arm64 platforms.
> >
> > Maybe we can have this default enabled on machines with 8 and more cpus and
> > provide a tlbflush_batched = on or off to allow users enable or
> > disable it according
> > to their hardware and products. Similar example: rodata=on or off.
>
> No, sounds bit excessive. Kernel command line options should not be added
> for every possible run time switch options.
>
> >
> > Hi Anshuman, Will,  Catalin, Andrew,
> > what do you think about this approach?
> >
> > BTW, haoxin mentioned another important user scenarios for tlb bach on arm64:
> > https://lore.kernel.org/lkml/393d6318-aa38-01ed-6ad8-f9eac89bf0fc@linux.alibaba.com/
> >
> > I do believe we need it based on the expensive cost of tlb shootdown in arm64
> > even by hardware broadcast.
>
> Alright, for now could we enable ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH selectively
> with CONFIG_EXPERT and for num_online_cpus()  > 8 ?

Sounds good to me. It is a good start to bring up tlb batched flush in
ARM64. Later on, we
might want to see it in both memory reclamation and migration.

Thanks
Barry



More information about the linux-arm-kernel mailing list