[PATCH] sched: topology: make cache topology separate from cpu topology
王擎
wangqing at vivo.com
Fri Mar 11 00:18:34 PST 2022
>>
>>
>> >On Thu, 10 Mar 2022 at 13:59, Qing Wang <wangqing at vivo.com> wrote:
>> >>
>> >> From: Wang Qing <wangqing at vivo.com>
>> >>
>> >> On some architectures (e.g. ARM64), caches are laid out as below:
>> >> cluster:               ****** cluster 0 *****      ****** cluster 1 *****
>> >> core:                  0      1      2      3      4      5      6      7
>> (add cache level 1)       c0     c1     c2     c3     c4     c5     c6     c7
>> >> cache(Leveln):         **cache0**    **cache1**    **cache2**    **cache3**
>> (add cache level 3)       *************share level 3 cache ***************
>> >> sd_llc_id(current):    0      0      0      0      4      4      4      4
>> >> sd_llc_id(should be):  0      0      2      2      4      4      6      6
>> >>
>> Here, n is always 2 on ARM64, but other values are also possible.
>> core[0,1] form a complex (ARMv9) that shares an L2 cache; the same holds for core[2,3].
>>
>> >> Caches and CPUs have different topologies. This causes cpus_share_cache()
>> >> to return the wrong value, which affects CPU load balancing.
>> >>
>> >What does your current scheduler topology look like?
>> >
>> >For CPU 0 to 3, do you have the below ?
>> >DIE [0 - 3] [4-7]
>> >MC [0] [1] [2] [3]
>>
>> The current scheduler topology is consistent with the CPU topology:
>> DIE [0-7]
>> MC [0-3] [4-7] (SD_SHARE_PKG_RESOURCES)
>> Most Android phones have this topology.
>> >
>> >But you would like something like below for cpu 0-1 instead ?
>> >DIE [0 - 3] [4-7]
>> >CLS [0 - 1] [2 - 3]
>> >MC [0] [1]
>> >
>> >with SD_SHARE_PKG_RESOURCES only set to MC level ?
>>
>> We don't change the current scheduler topology, but the
>> cache topology should be separated like below:
>
>The scheduler topology is not only the cpu topology but a mix of the cpu
>and cache/memory topology
>
>> [0-7] (shared level 3 cache )
>> [0-1] [2-3][4-5][6-7] (shared level 2 cache )
>
>So you don't need the intermediate cluster level, which is even simpler.
>You have to modify the generic arch topology so that cpu_coregroup_mask
>returns the correct cpu mask directly.
>
>You will notice a llc_sibling field that is currently used by acpi but
>not DT to return llc cpu mask
>
cpu_topology[].llc_sibling describes the last-level cache of the whole system,
not of a sched_domain.
In the cache topology above, llc_sibling is 0xff ([0-7]); it describes
the L3 cache siblings. sd_llc_id, however, describes the largest shared cache
within the sd, which should be [0-1] instead of [0-3].
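
To make the difference concrete, here is a minimal userspace sketch, not kernel
code. It assumes sd_llc_id is the first CPU of the relevant mask (as
cpumask_first() would give) and that cpus_share_cache() just compares the two
ids. The helpers first_cpu() and share_cache() and the mc_span[] and
l2_sibling[] arrays are hypothetical names for illustration; only the mask
values come from the topology above.

```c
#define NR_CPUS 8

/* Lowest CPU number set in a mask, in the spirit of cpumask_first(). */
int first_cpu(unsigned int mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return -1;
}

/* L3 siblings: one mask covering the whole system, [0-7]. */
static const unsigned int llc_sibling[NR_CPUS] = {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

/* Current MC domain span (hypothetical array): [0-3] [4-7]. */
static const unsigned int mc_span[NR_CPUS] = {
	0x0f, 0x0f, 0x0f, 0x0f, 0xf0, 0xf0, 0xf0, 0xf0
};

/* L2 siblings (hypothetical array): [0-1] [2-3] [4-5] [6-7]. */
static const unsigned int l2_sibling[NR_CPUS] = {
	0x03, 0x03, 0x0c, 0x0c, 0x30, 0x30, 0xc0, 0xc0
};

/* Two CPUs "share cache" when the ids derived from their masks match. */
int share_cache(const unsigned int *mask, int a, int b)
{
	return first_cpu(mask[a]) == first_cpu(mask[b]);
}
```

With mc_span[] the ids come out as 0 0 0 0 4 4 4 4, so cpu1 and cpu2 are
reported as sharing cache. With l2_sibling[] they come out as 0 0 2 2 4 4 6 6,
and cpu1 and cpu2 no longer match. With llc_sibling[] every CPU gets id 0.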
Thanks,
Wang