[PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids

Conor.Dooley at microchip.com Conor.Dooley at microchip.com
Tue Jul 5 12:06:17 PDT 2022


[Adding back the CC list from the original thread]

On 05/07/2022 13:27, Brice Goglin wrote:
> [You don't often get email from brice.goglin at inria.fr. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> Hello Conor
> 
> I am the main developer of hwloc [1] which is used by many people to
> detect the topology of servers. We're started to see some users of hwloc
> on RISC-V and we got some reports about the topology exposed by
> Linux/sysfs being wrong on some platforms.
> 
> For instance https://github.com/open-mpi/hwloc/issues/536 says HiFive
> Unmatched with SiFive Freedom U740 running Linux 5.15 exposes a single
> core with 4 threads instead of 4 cores, while StarFive VisionFive v1
> with JH7100 running 5.18.5 correctly exposes 2 cores.

And with Sudeep's patches applied I get (next-20220704):
# hwloc-calc -N core all
1
# hwloc-calc -N pu all
4
On a PolarFire SoC (so the same as a SiFive U540).
So unfortunately, these patches are not the fix you seek!

Wracked my brains for a bit, but could not see any differences
between the U740 and the JH7100. Culprit seems to be the lack
of a cpu-map node (which is only present in the downstream dt).

I've sent patches for the upstream devicetrees:
https://lore.kernel.org/linux-riscv/20220705190435.1790466-1-mail@conchuod.ie/

> 
> Can you tell me what's the overall status of the CPU/NUMA topology
> exposed in sysfs on RISC-V?

Heh, you've got the wrong man. I don't know.

> Does it depend a lot on the platform because
> device-tree and/or ACPI aren't always properly filled by vendors? Does
> it depend a lot on the Linux kernel version? Should I expect significant
> improvements for both in the next months?
> 
I don't know that either. This is why it's a good idea to preserve
the CC list!


> Thanks
> 
> Brice
> 
> [1] https://www.open-mpi.org/projects/hwloc/
> 
> 
> 
> Le 04/07/2022 à 17:10, Conor.Dooley at microchip.com a écrit :
>> On 04/07/2022 11:15, Sudeep Holla wrote:
>>> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>>>
>>> Hi Greg,
>>>
>>> Let me know if you prefer to pull the patches directly or prefer pull
>>> request. It has been in -next for a while now.
>>>
>>> Hi All,
>>>
>>> This version updates cacheinfo to populate and use the information from
>>> there for all the cache topology.
>>>
>>> This series intends to fix some discrepancies we have in the CPU topology
>>> parsing from the device tree /cpu-map node. Also this diverges from the
>>> behaviour on a ACPI enabled platform. The expectation is that both DT
>>> and ACPI enabled systems must present consistent view of the CPU topology.
>>>
>>> Currently we assign generated cluster count as the physical package identifier
>>> for each CPU which is wrong. The device tree bindings for CPU topology supports
>>> sockets to infer the socket or physical package identifier for a given CPU.
>>> Also we don't check if all the cores/threads belong to the same cluster before
>>> updating their sibling masks which is fine as we don't set the cluster id yet.
>>>
>>> These changes also assigns the cluster identifier as parsed from the device tree
>>> cluster nodes within /cpu-map without support for nesting of the clusters.
>>> Finally, it also add support for socket nodes in /cpu-map. With this the
>>> parsing of exact same information from ACPI PPTT and /cpu-map DT node
>>> aligns well.
>>>
>>> The only exception is that the last level cache id information can be
>>> inferred from the same ACPI PPTT while we need to parse CPU cache nodes
>>> in the device tree.
>> For DT + RISC-V on PolarFire SoC and SiFive fu540
>> Tested-by: Conor Dooley <conor.dooley at microchip.com>
>>
>> Anecdotally, v5 was tested on the !SMP D1 which worked fine when
>> CONFIG_SMP was enabled.
>>
>> Thanks,
>> Conor.
>>
>>>
>>> v5[5]->v6:
>>>          - Handled out of memory case in early detected correctly after
>>>            Conor reported boot failures on some RISC-V platform. Also
>>>            added a log to show up failure of early cacheinfo detection.
>>>          - Added "Remove the unused find_acpi_cpu_cache_topology()" which
>>>            was missed earlier and posted separately
>>>          - Added all the additional tags recieved
>>>
>>> v4[4]->v5[5]:
>>>          - Added all the tags recieved so far. Rafael has acked only change
>>>            in ACPI and Catalin has acked only change in arm64.
>>>          - Addressed all the typos pointed by Ionela and dropped the patch
>>>            removing the checks for invalid package id as discussed and update
>>>            depth in nested cluster warning check.
>>>
>>> v3[3]->v4[4]:
>>>          - Updated ACPI PPTT fw_token to use table offset instead of virtual
>>>            address as it could get changed for everytime it is mapped before
>>>            the global acpi_permanent_mmap is set
>>>          - Added warning for the topology with nested clusters
>>>          - Added update to cpu_clustergroup_mask so that introduction of
>>>            correct cluster_id doesn't break existing platforms by limiting
>>>            the span of clustergroup_mask(by Ionela)
>>>
>>> v2[2]->v3[3]:
>>>          - Dropped support to get the device node for the CPU's LLC
>>>          - Updated cacheinfo to support calling of detect_cache_attributes
>>>            early in smp_prepare_cpus stage
>>>          - Added support to check if LLC is valid and shared in the cacheinfo
>>>          - Used the same in arch_topology
>>>
>>> v1[1]->v2[2]:
>>>          - Updated ID validity check include all non-negative value
>>>          - Added support to get the device node for the CPU's last level cache
>>>          - Added support to build llc_sibling on DT platforms
>>>
>>> [1] https://lore.kernel.org/lkml/20220513095559.1034633-1-sudeep.holla@arm.com
>>> [2] https://lore.kernel.org/lkml/20220518093325.2070336-1-sudeep.holla@arm.com
>>> [3] https://lore.kernel.org/lkml/20220525081416.3306043-1-sudeep.holla@arm.com
>>> [4] https://lore.kernel.org/lkml/20220621192034.3332546-1-sudeep.holla@arm.com
>>> [5] https://lore.kernel.org/lkml/20220627165047.336669-1-sudeep.holla@arm.com
>>>
>>> Ionela Voinescu (1):
>>>    arch_topology: Limit span of cpu_clustergroup_mask()
>>>
>>> Sudeep Holla (20):
>>>    ACPI: PPTT: Use table offset as fw_token instead of virtual address
>>>    cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
>>>    cacheinfo: Add helper to access any cache index for a given CPU
>>>    cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
>>>    cacheinfo: Add support to check if last level cache(LLC) is valid or shared
>>>    cacheinfo: Allow early detection and population of cache attributes
>>>    cacheinfo: Use cache identifiers to check if the caches are shared if available
>>>    cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability
>>>    arch_topology: Add support to parse and detect cache attributes
>>>    arch_topology: Use the last level cache information from the cacheinfo
>>>    arm64: topology: Remove redundant setting of llc_id in CPU topology
>>>    arch_topology: Drop LLC identifier stash from the CPU topology
>>>    arch_topology: Set thread sibling cpumask only within the cluster
>>>    arch_topology: Check for non-negative value rather than -1 for IDs validity
>>>    arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
>>>    arch_topology: Don't set cluster identifier as physical package identifier
>>>    arch_topology: Set cluster identifier in each core/thread from /cpu-map
>>>    arch_topology: Add support for parsing sockets in /cpu-map
>>>    arch_topology: Warn that topology for nested clusters is not supported
>>>    ACPI: Remove the unused find_acpi_cpu_cache_topology()
>>>
>>>   arch/arm64/kernel/topology.c  |  14 ----
>>>   drivers/acpi/pptt.c           |  40 +---------
>>>   drivers/base/arch_topology.c  | 102 ++++++++++++++++++------
>>>   drivers/base/cacheinfo.c      | 143 ++++++++++++++++++++++------------
>>>   include/linux/acpi.h          |   5 --
>>>   include/linux/arch_topology.h |   1 -
>>>   include/linux/cacheinfo.h     |   3 +
>>>   7 files changed, 175 insertions(+), 133 deletions(-)
>>>
>>> -- 
>>> 2.37.0
>>>



More information about the linux-arm-kernel mailing list