[PATCH] arch/arm64: Fix topology initialization for core scheduling

Phil Auld pauld at redhat.com
Tue Mar 29 12:50:48 PDT 2022


On Tue, Mar 29, 2022 at 08:55:08PM +0200 Dietmar Eggemann wrote:
> On 29/03/2022 17:20, Phil Auld wrote:
> > On Tue, Mar 29, 2022 at 04:02:22PM +0200 Dietmar Eggemann wrote:
> >> On 22/03/2022 17:03, Phil Auld wrote:
> 
> [...]
> 
> >> I assume this is for a machine which relies on MPIDR-based setup
> >> (package_id == -1)? I.e. it doesn't have proper ACPI/(DT) data for
> >> topology setup.
> > 
> > Yes, that's my understanding. No PPTT.
> > 
> >>
> >> Tried on a ThunderX2 by disabling parse_acpi_topology() but then I end
> >> up with a machine w/o SMT, so `stress-ng --prctl N` doesn't show this issue.
> >>
> >> Which machine were you using?
> > 
> > This instance is an HPE Apollo 70 set to smt-4.  I believe it's ThunderX2
> > chips.
> > 
> > ARM (CN9980-2200LG4077-Y21-G) 
> I'm using the same processor just with ACPI/PPTT.
>

Maybe I'm misinformed about these systems having no PPTT...  

I'm reclaiming the system. Is there a way I can tell from userspace?


> # sudo dmidecode -t 4 | grep "Part Number"
> 	Part Number: CN9980-2200LG4077-21-Y-G
> 	Part Number: CN9980-2200LG4077-21-Y-G
> 
> # cat /sys/devices/system/cpu/cpu0/topology/thread_siblings
> 0,32,64,96
> 
> # cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
> SMT
> MC
> NUMA
> 
> But no matter whether I disable parse_acpi_topology() or just force
> `cpu_topology[cpu].package_id = -1` in this function, I always end up with:
> 
> # cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
> MC
> NUMA
> 
> # cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
> 0
> 
> so no SMT sched domain. The MPIDR-based topology fallback code in
> store_cpu_topology() forces `cpuid_topo->thread_id  = -1`.

Right. So since I'm getting SMT, it must not have package_id == -1.
In which case you should be able to reproduce it, because it must
be that the call to update_siblings_masks() is required. That
appears to be called only from store_cpu_topology(), which runs
after the scheduler has already set up the core pointers.
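
For anyone wanting to confirm from userspace whether a box ends up with
SMT siblings at all, here is a minimal sketch built on the same sysfs
file quoted above. The function name cpu_has_smt and its optional
second argument (an alternate sysfs root, there only so it can be
tried against a fake tree) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (0) if the given CPU lists more than one
# thread sibling, fails (1) if it is alone, returns 2 if the sysfs file
# cannot be read. The default root is the sysfs path quoted in this thread.
cpu_has_smt() {
    cpu="${1:-0}"
    topo_root="${2:-/sys/devices/system/cpu}"
    file="$topo_root/cpu$cpu/topology/thread_siblings_list"

    [ -r "$file" ] || return 2

    # A lone CPU reads back as a bare number ("0"); siblings show up as
    # a list ("0,32,64,96") or a range ("0-3").
    case "$(cat "$file")" in
        *[,-]*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

Something like `cpu_has_smt 0 && echo "cpu0 has SMT siblings"` would
then tell the two cases in this thread apart.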

The fix could be the same, but I should reword the commit message,
since it should affect all SMT arm systems, I'd think.

Or maybe the ACPI topology code should call update_siblings_masks().


> 
> IMHO this is why on my machine I don't see this issue while running:
> 
> root@oss-apollo7007:~# stress-ng --prctl 256 -t 60
> stress-ng: info:  [2388042] dispatching hogs: 256 prctl
> 
> Is there something I miss in my setup to provoke this issue?
>

Make sure you have a stress-ng that is new enough and built against
headers that have the CORE_SCHED prctls defined.
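
A rough way to check the header side, as a sketch only: it assumes the
UAPI headers are installed at /usr/include/linux/prctl.h (distro
dependent) and the function name headers_have_core_sched is made up.
It only looks at the headers on the current system, so it says nothing
about what an already-built stress-ng binary was compiled against.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given prctl header defines
# PR_SCHED_CORE. The default path is an assumption about where the
# kernel UAPI headers are installed.
headers_have_core_sched() {
    hdr="${1:-/usr/include/linux/prctl.h}"

    [ -r "$hdr" ] || return 1

    # Match the PR_SCHED_CORE define itself, not the PR_SCHED_CORE_*
    # sub-command defines that follow it in the header.
    grep -Eq '^[[:space:]]*#[[:space:]]*define[[:space:]]+PR_SCHED_CORE[[:space:]]' "$hdr"
}
```

If `headers_have_core_sched` fails on the build host, a stress-ng built
there most likely has no core scheduling prctl calls to exercise.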


BTW, thanks for taking a look.


Cheers,
Phil

> [...]
> 
