[RFC PATCH v5 1/4] topology: Represent clusters of CPUs within a die

Song Bao Hua (Barry Song) song.bao.hua at hisilicon.com
Tue Apr 20 04:30:53 BST 2021



> -----Original Message-----
> From: Greg KH [mailto:gregkh at linuxfoundation.org]
> Sent: Friday, March 19, 2021 11:02 PM
> To: Jonathan Cameron <jonathan.cameron at huawei.com>
> Cc: Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>;
> tim.c.chen at linux.intel.com; catalin.marinas at arm.com; will at kernel.org;
> rjw at rjwysocki.net; vincent.guittot at linaro.org; bp at alien8.de;
> tglx at linutronix.de; mingo at redhat.com; lenb at kernel.org; peterz at infradead.org;
> dietmar.eggemann at arm.com; rostedt at goodmis.org; bsegall at google.com;
> mgorman at suse.de; msys.mizuma at gmail.com; valentin.schneider at arm.com;
> juri.lelli at redhat.com; mark.rutland at arm.com; sudeep.holla at arm.com;
> aubrey.li at linux.intel.com; linux-arm-kernel at lists.infradead.org;
> linux-kernel at vger.kernel.org; linux-acpi at vger.kernel.org; x86 at kernel.org;
> xuwei (O) <xuwei5 at huawei.com>; Zengtao (B) <prime.zeng at hisilicon.com>;
> guodong.xu at linaro.org; yangyicong <yangyicong at huawei.com>; Liguozhu (Kenneth)
> <liguozhu at hisilicon.com>; linuxarm at openeuler.org; hpa at zytor.com
> Subject: Re: [RFC PATCH v5 1/4] topology: Represent clusters of CPUs within
> a die
> 
> On Fri, Mar 19, 2021 at 09:36:16AM +0000, Jonathan Cameron wrote:
> > On Fri, 19 Mar 2021 06:57:08 +0000
> > "Song Bao Hua (Barry Song)" <song.bao.hua at hisilicon.com> wrote:
> >
> > > > -----Original Message-----
> > > > From: Greg KH [mailto:gregkh at linuxfoundation.org]
> > > > Sent: Friday, March 19, 2021 7:35 PM
> > > > To: Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>
> > > > > Cc: tim.c.chen at linux.intel.com; catalin.marinas at arm.com; will at kernel.org;
> > > > > rjw at rjwysocki.net; vincent.guittot at linaro.org; bp at alien8.de;
> > > > > tglx at linutronix.de; mingo at redhat.com; lenb at kernel.org; peterz at infradead.org;
> > > > > dietmar.eggemann at arm.com; rostedt at goodmis.org; bsegall at google.com;
> > > > > mgorman at suse.de; msys.mizuma at gmail.com; valentin.schneider at arm.com;
> > > > > Jonathan Cameron <jonathan.cameron at huawei.com>; juri.lelli at redhat.com;
> > > > > mark.rutland at arm.com; sudeep.holla at arm.com; aubrey.li at linux.intel.com;
> > > > > linux-arm-kernel at lists.infradead.org; linux-kernel at vger.kernel.org;
> > > > > linux-acpi at vger.kernel.org; x86 at kernel.org; xuwei (O) <xuwei5 at huawei.com>;
> > > > > Zengtao (B) <prime.zeng at hisilicon.com>; guodong.xu at linaro.org; yangyicong
> > > > > <yangyicong at huawei.com>; Liguozhu (Kenneth) <liguozhu at hisilicon.com>;
> > > > > linuxarm at openeuler.org; hpa at zytor.com
> > > > Subject: Re: [RFC PATCH v5 1/4] topology: Represent clusters of CPUs within
> > > > a die
> > > >
> > > > On Fri, Mar 19, 2021 at 05:16:15PM +1300, Barry Song wrote:
> > > > > > diff --git a/Documentation/admin-guide/cputopology.rst b/Documentation/admin-guide/cputopology.rst
> > > > > index b90dafc..f9d3745 100644
> > > > > --- a/Documentation/admin-guide/cputopology.rst
> > > > > +++ b/Documentation/admin-guide/cputopology.rst
> > > > > @@ -24,6 +24,12 @@ core_id:
> > > > >  	identifier (rather than the kernel's).  The actual value is
> > > > >  	architecture and platform dependent.
> > > > >
> > > > > +cluster_id:
> > > > > +
> > > > > +	the Cluster ID of cpuX.  Typically it is the hardware platform's
> > > > > +	identifier (rather than the kernel's).  The actual value is
> > > > > +	architecture and platform dependent.
> > > > > +
> > > > >  book_id:
> > > > >
> > > > >  	the book ID of cpuX. Typically it is the hardware platform's
> > > > > @@ -56,6 +62,14 @@ package_cpus_list:
> > > > >  	human-readable list of CPUs sharing the same physical_package_id.
> > > > >  	(deprecated name: "core_siblings_list")
> > > > >
> > > > > +cluster_cpus:
> > > > > +
> > > > > +	internal kernel map of CPUs within the same cluster.
> > > > > +
> > > > > +cluster_cpus_list:
> > > > > +
> > > > > +	human-readable list of CPUs within the same cluster.
> > > > > +
> > > > >  die_cpus:
> > > > >
> > > > >  	internal kernel map of CPUs within the same die.
> > > >
> > > > Why are these sysfs files in this file, and not in a Documentation/ABI/
> > > > file which can be correctly parsed and shown to userspace?
> > >
> > > Well, those ABIs have been there for quite a long time. They look like this:
> > >
> > > [root at ceph1 topology]# ls
> > > core_id  core_siblings  core_siblings_list  physical_package_id  thread_siblings  thread_siblings_list
> > > [root at ceph1 topology]# pwd
> > > /sys/devices/system/cpu/cpu100/topology
> > > [root at ceph1 topology]# cat core_siblings_list
> > > 64-127
> > > [root at ceph1 topology]#
> > >
> > > >
> > > > Any chance you can fix that up here as well?
> > >
> > > Yes. We will send a separate patch to address this, which won't
> > > be in this patchset. This patchset will be based on that one.
> > >
> > > >
> > > > Also note that "list" is not something that goes in sysfs, sysfs is "one
> > > > value per file", and a list is not "one value".  How do you prevent
> > > > overflowing the buffer of the sysfs file if you have a "list"?
> > > >
> > >
> > > At a glance, the list uses "-" ranges rather than enumerating every CPU:
> > > [root at ceph1 topology]# cat core_siblings_list
> > > 64-127
> > >
> > > Anyway, I will check whether it has any chance of overflowing.
> >
> > It could in theory be alternate CPUs as a comma-separated list.
> > So it would get interesting around 500-1000 CPUs (guessing).
> >
> > Hopefully no one has that crazy a CPU numbering scheme, but it's possible
> > (note that cluster is fine for this, but I guess it might eventually
> > happen for the core_siblings list, i.e. CPUs within a package).
> >
> > Shouldn't crash or anything like that but might terminate early.
> 
> We have a broken sysfs api already for listing LED numbers that has had
> to be worked around in the past, please do not create a new one with
> that same problem, we should learn from them :)

Another place where I see a CPU list is the NUMA topology:
/sys/devices/system/node/nodex/cpulist.

But the code has a BUILD_BUG_ON to guard the pagebuf:

static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
{
	ssize_t n;
	cpumask_var_t mask;
	struct node *node_dev = to_node(dev);

	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));

	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
		return 0;

	cpumask_and(mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);
	n = cpumap_print_to_pagebuf(list, buf, mask);
	free_cpumask_var(mask);

	return n;
}
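
To make the arithmetic in that BUILD_BUG_ON concrete, here is a minimal userspace sketch of the same expression. The names EXAMPLE_PAGE_SIZE, mask_chars() and mask_fits_in_page() are hypothetical and are not kernel symbols, and the 4096-byte page is an assumption. Note that the expression budgets for the hex mask form (8 hex digits plus one separator per 32 CPUs); it does not say anything about the list form.

```c
/* Assumption for illustration: a 4096-byte page. The kernel takes
 * PAGE_SIZE and NR_CPUS from its configuration instead. */
#define EXAMPLE_PAGE_SIZE 4096u

/* Characters budgeted for the hex mask form: 8 hex digits plus one
 * separator per 32 CPUs, the same expression as in the BUILD_BUG_ON. */
static unsigned int mask_chars(unsigned int nr_cpus)
{
	return nr_cpus / 32 * 9;
}

/* Non-zero if the mask form fits in one page, leaving a byte for NUL. */
static int mask_fits_in_page(unsigned int nr_cpus)
{
	return mask_chars(nr_cpus) <= EXAMPLE_PAGE_SIZE - 1;
}
```

Under these assumptions, mask_fits_in_page(8192) is true (2304 characters), while mask_fits_in_page(16384) is false (4608 characters).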

For the lists in the CPU topology, I haven't seen such a guard, though I believe we need one.
Or am I missing something?
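
As a rough way to size the problem for the list form, the sketch below counts the characters needed in the worst case mentioned earlier in the thread, where only alternate (even-numbered) CPUs are present so that no "-" ranges can be formed. It is a userspace approximation, not kernel code: EXAMPLE_PAGE_SIZE, alternate_list_chars() and first_overflowing_nr_cpus() are hypothetical names, a 4096-byte page is assumed, and each entry is counted as its decimal digits plus one separator (a comma, or the trailing newline for the last entry).

```c
#include <stdio.h>

/* Assumption for illustration: a 4096-byte page buffer. */
#define EXAMPLE_PAGE_SIZE 4096u

/* Characters in the list form "0,2,4,...\n" when only even-numbered
 * CPUs below nr_cpus are present. */
static unsigned int alternate_list_chars(unsigned int nr_cpus)
{
	unsigned int cpu, total = 0;

	for (cpu = 0; cpu < nr_cpus; cpu += 2)
		/* snprintf(NULL, 0, ...) returns the number of digits. */
		total += (unsigned int)snprintf(NULL, 0, "%u", cpu) + 1;

	return total;
}

/* Smallest CPU count whose alternate-CPU list no longer fits in one
 * page once a byte is reserved for the terminating NUL. */
static unsigned int first_overflowing_nr_cpus(void)
{
	unsigned int n = 1;

	while (alternate_list_chars(n) <= EXAMPLE_PAGE_SIZE - 1)
		n++;

	return n;
}
```

Under these assumptions first_overflowing_nr_cpus() returns 1861: the list for CPUs 0..1858 needs exactly 4095 characters, and adding CPU 1860 brings it to 4100. A different page size or numbering pattern would shift that threshold.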

> 
> thanks,
> 
> greg k-h

Thanks
Barry



