[PATCH] sched: topology: make cache topology separate from cpu topology

Vincent Guittot vincent.guittot at linaro.org
Fri Mar 11 00:25:45 PST 2022


On Fri, 11 Mar 2022 at 09:18, 王擎 <wangqing at vivo.com> wrote:
>
>
> >>
> >>
> >> >On Thu, 10 Mar 2022 at 13:59, Qing Wang <wangqing at vivo.com> wrote:
> >> >>
> >> >> From: Wang Qing <wangqing at vivo.com>
> >> >>
> >> >> On some architectures (e.g. ARM64), caches are laid out as below:
> >> >> cluster:                **** cluster 0 ****       **** cluster 1 ****
> >> >> core:                   0     1     2     3       4     5     6     7
> >> (add cache level 1)        c0    c1    c2    c3      c4    c5    c6    c7
> >> >> cache(Leveln):          **cache0**  **cache1**    **cache2**  **cache3**
> >> (add cache level 3)        ************* shared level 3 cache *************
> >> >> sd_llc_id(current):     0     0     0     0       4     4     4     4
> >> >> sd_llc_id(should be):   0     0     2     2       4     4     6     6
> >> >>
> >> Here, n is always 2 on ARM64, but other values are also possible.
> >> core[0,1] form a complex (ARMv9) that shares an L2 cache; core[2,3] is the same.
> >>
> >> >> Caches and CPUs have different topologies. This causes cpus_share_cache()
> >> >> to return the wrong value, which affects CPU load balancing.
> >> >>
> >> >What does your current scheduler topology look like?
> >> >
> >> >For CPU 0 to 3, do you have the below ?
> >> >DIE [0     -     3] [4-7]
> >> >MC  [0] [1] [2] [3]
> >>
> >> The current scheduler topology is consistent with the CPU topology:
> >> DIE  [0-7]
> >> MC  [0-3] [4-7]  (SD_SHARE_PKG_RESOURCES)
> >> Most Android phones have this topology.
> >> >
> >> >But you would like something like below for cpu 0-1 instead ?
> >> >DIE [0     -     3] [4-7]
> >> >CLS [0 - 1] [2 - 3]
> >> >MC  [0] [1]
> >> >
> >> >with SD_SHARE_PKG_RESOURCES only set to MC level ?
> >>
> >> We don't change the current scheduler topology, but the
> >> cache topology should be separated like below:
> >
> >The scheduler topology is not only the cpu topology but a mix of cpu and
> >cache/memory topology
> >
> >> [0-7]                          (shared level 3 cache )
> >> [0-1] [2-3][4-5][6-7]   (shared level 2 cache )
> >
> >So you don't bother with the intermediate cluster level, which is even simpler.
> >You have to modify the generic arch topology so that cpu_coregroup_mask
> >returns the correct cpu mask directly.
> >
> >You will notice a llc_sibling field that is currently used by acpi but
> >not DT to return llc cpu mask
> >
> cpu_topology[].llc_sibling describes the last level cache of the whole system,
> not of the sched_domain.
>
> In the above cache topology, llc_sibling is 0xff ([0-7]); it describes

If llc_sibling were 0xff ([0-7]) on your system, you would have only one level:
MC[0-7]
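
To illustrate why a single level would remain: a level whose span is identical
to the level below it adds no information and is collapsed. The sketch below is
a simplified userspace model, not kernel code. The helper name
count_effective_levels() and the mask-array representation are assumptions made
for illustration, and the real degenerate-domain checks in the kernel also look
at domain flags.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function).
 * spans[] lists, bottom-up, the CPU mask that one CPU sees at each
 * candidate topology level. A level whose mask equals the last kept
 * level's mask is treated as redundant and is not counted.
 */
static size_t count_effective_levels(const unsigned int *spans, size_t n)
{
	size_t kept = 0;
	unsigned int prev = 0;

	for (size_t i = 0; i < n; i++) {
		if (kept > 0 && spans[i] == prev)
			continue;	/* same span as the level below: redundant */
		prev = spans[i];
		kept++;
	}
	return kept;
}
```

With MC = 0xff and DIE = 0xff the helper counts one level, while MC = 0x0f and
DIE = 0xff gives two.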

> the L3 cache sibling, but sd_llc_id describes the maximum shared cache
> in sd, which should be [0-1] instead of [0-3].

sd_llc_id describes the last sched_domain with SD_SHARE_PKG_RESOURCES.
If you want llc to be [0-3], make sure that the
sched_domain_topology_level array returns the correct cpumask with
this flag.
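
As a rough illustration of that rule, here is a minimal userspace sketch, not
kernel code. The struct topo_level type, the group_size field, the
SHARE_PKG_RESOURCES stand-in and all function names are assumptions made for
this example. It walks the levels bottom-up, stops at the first level without
the flag, and uses the first CPU of the last flagged span as the id.

```c
#include <stdio.h>

#define NR_CPUS 8
#define SHARE_PKG_RESOURCES 0x1u /* stand-in for SD_SHARE_PKG_RESOURCES */

/* Hypothetical, simplified topology level: CPUs are grouped into
 * contiguous blocks of group_size CPUs. */
struct topo_level {
	const char *name;
	unsigned int flags;
	int group_size;
};

/* First CPU of the block that contains cpu at this level. */
static int first_cpu_of_span(const struct topo_level *tl, int cpu)
{
	return (cpu / tl->group_size) * tl->group_size;
}

/* Walk bottom-up; the last level that still carries the flag decides the id. */
static int llc_id(const struct topo_level *levels, int nr_levels, int cpu)
{
	int id = cpu;

	for (int i = 0; i < nr_levels; i++) {
		if (!(levels[i].flags & SHARE_PKG_RESOURCES))
			break;
		id = first_cpu_of_span(&levels[i], cpu);
	}
	return id;
}

/* Two CPUs are considered cache siblings when their ids match. */
static int share_cache(const struct topo_level *levels, int nr_levels,
		       int a, int b)
{
	return llc_id(levels, nr_levels, a) == llc_id(levels, nr_levels, b);
}

/* MC [0-3] [4-7] with the flag, DIE [0-7] without it. */
static const struct topo_level current_topo[] = {
	{ "MC",  SHARE_PKG_RESOURCES, 4 },
	{ "DIE", 0,                   8 },
};

/* MC [0-1] [2-3] [4-5] [6-7] with the flag, DIE [0-7] without it. */
static const struct topo_level wanted_topo[] = {
	{ "MC",  SHARE_PKG_RESOURCES, 2 },
	{ "DIE", 0,                   8 },
};

/* Print the id of every CPU for one topology. */
static void print_llc_ids(const char *label,
			  const struct topo_level *levels, int nr_levels)
{
	printf("%s:", label);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		printf(" %d", llc_id(levels, nr_levels, cpu));
	printf("\n");
}
```

Calling print_llc_ids("current", current_topo, 2) and
print_llc_ids("wanted", wanted_topo, 2) reproduces the two sd_llc_id rows from
the diagram quoted earlier in this thread.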


>
> Thanks,
> Wang


