[PATCH v3] topology: make core_mask include at least cluster_siblings

Darren Hart darren at os.amperecomputing.com
Wed Mar 9 10:26:24 PST 2022


On Wed, Mar 09, 2022 at 01:50:07PM +0100, Dietmar Eggemann wrote:
> On 08/03/2022 18:49, Darren Hart wrote:
> > On Tue, Mar 08, 2022 at 05:03:07PM +0100, Dietmar Eggemann wrote:
> >> On 08/03/2022 12:04, Vincent Guittot wrote:
> >>> On Tue, 8 Mar 2022 at 11:30, Will Deacon <will at kernel.org> wrote:
> 
> [...]
> 
> >> IMHO, if core_mask weight is 1, MC will be removed/degenerated anyway.
> >>
> >> This is what I get on my Ampere Altra (I guess I don't have the ACPI
> >> changes which would lead to a CLS sched domain):
> >>
> >> # cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
> >> DIE
> >> NUMA
> >> root at oss-altra01:~# zcat /proc/config.gz | grep SCHED_CLUSTER
> >> CONFIG_SCHED_CLUSTER=y
> > 
> > I'd like to follow up on this. Would you share your dmidecode BIOS
> > Information section?
> 
> # dmidecode -t 0
> # dmidecode 3.2
> Getting SMBIOS data from sysfs.
> SMBIOS 3.2.0 present.
> 
> Handle 0x0000, DMI type 0, 26 bytes
> BIOS Information
> 	Vendor: Ampere(TM)
> 	Version: 0.9.20200724
> 	Release Date: 2020/07/24
> 	ROM Size: 7680 kB
> 	Characteristics:
> 		PCI is supported
> 		BIOS is upgradeable
> 		Boot from CD is supported
> 		Selectable boot is supported
> 		ACPI is supported
> 		UEFI is supported
> 	BIOS Revision: 5.15
> 	Firmware Revision: 0.6
> 

Thank you, I'm following up internally and will get back to you.

> > Which kernel version?
> 
> v5.17-rc5
> 
> [...]
> 
> >>> I would not say that I'm happy because this solution skews the core
> >>> cpu mask in order to abuse the scheduler so that it will remove a
> >>> wrong but useless level when it will build its domains.
> >>> But this works, so as long as the maintainers are happy, I'm fine
> > 
> > I did explore the other options and they added considerably more
> > complexity without much benefit in my view. I prefer this option which
> > maintains the cpu_topology as described by the platform, and maps it
> > into something that suits the current scheduler abstraction. I agree
> > there is more work to be done here and intend to continue with it.
> > 
> >> I do not have any better idea than this tweak here either in case the
> >> platform can't provide a cleaner setup.
> > 
> > I'd argue the platform is describing itself accurately in ACPI PPTT
> > terms. The topology doesn't fit nicely within the kernel abstractions
> > today. This is an area where I hope to continue to improve things going
> > forward.
> 
> I see. And I assume lying about SCU/LLC boundaries in ACPI is not an
> option since it messes up /sys/devices/system/cpu/cpu0/cache/index*/.
> 
> [...]

I'm not aware of a way to accurately describe the SCU topology in the PPTT, and
the risk of lying about the LLC topology is that the lie has to be understood by
every OS and must not conflict with other lies people may ask for. In general, I
think it is preferable and more maintainable to describe the topology as
accurately and honestly as we can within the existing platform mechanisms (PPTT,
HMAT, etc.) and to work on the higher-level abstractions to accommodate a broader
set of topologies as they emerge (as well as working to describe the topology
more fully with new platform-level mechanisms as needed).

As I mentioned, I intend to continue looking into how to improve the current
abstractions. For now, it sounds like we have agreement that this patch can be
merged to address the BUG?
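
To make the idea in the subject line concrete, here is a minimal user-space
sketch of widening a core mask so that it includes at least the cluster
siblings. It is an illustration only, not the patch itself: the 64-bit
`cpumask_t` stand-in and the helpers `mask_subset`, `mask_weight`, and
`widen_core_mask` are hypothetical names, and the kernel's real cpumask API
differs.

```c
#include <stdint.h>

/* Toy stand-in for a CPU mask: one bit per CPU, up to 64 CPUs (assumption). */
typedef uint64_t cpumask_t;

/* Returns 1 when every CPU set in a is also set in b. */
static int mask_subset(cpumask_t a, cpumask_t b)
{
	return (a & ~b) == 0;
}

/* Number of CPUs set in the mask. */
static int mask_weight(cpumask_t m)
{
	int n = 0;

	for (; m; m &= m - 1)	/* clear the lowest set bit each pass */
		n++;
	return n;
}

/*
 * Hypothetical helper: if the cluster siblings cover the core mask,
 * return the cluster siblings so the core mask is never narrower than
 * the cluster. Otherwise leave the core mask unchanged.
 */
static cpumask_t widen_core_mask(cpumask_t core_mask,
				 cpumask_t cluster_siblings)
{
	if (mask_subset(core_mask, cluster_siblings))
		return cluster_siblings;
	return core_mask;
}
```

For example, with a core mask of 0x1 (CPU0 only) and cluster siblings of 0x3
(CPU0 and CPU1), `widen_core_mask` returns 0x3, which has a weight of 2. A core
mask that already spans more than the cluster, such as 0xF, is returned
unchanged.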

Thanks all,

-- 
Darren Hart
Ampere Computing / OS and Kernel


