[PATCH RFC 0/5] support NUMA emulation for arm64

Pierre Gondois pierre.gondois at arm.com
Thu Oct 12 05:37:43 PDT 2023


Hello Rongwei,

On 10/12/23 04:48, Rongwei Wang wrote:
> A brief introduction
> ====================
> 
> NUMA emulation can fake additional nodes on top of a single-node
> system, e.g.
> 
> one node system:
> 
> [root@localhost ~]# numactl -H
> available: 1 nodes (0)
> node 0 cpus: 0 1 2 3 4 5 6 7
> node 0 size: 31788 MB
> node 0 free: 31446 MB
> node distances:
> node   0
>    0:  10
> 
> with numa=fake=2 added (fake 2 nodes on each original node):
> 
> [root@localhost ~]# numactl -H
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3 4 5 6 7
> node 0 size: 15806 MB
> node 0 free: 15451 MB
> node 1 cpus: 0 1 2 3 4 5 6 7
> node 1 size: 16029 MB
> node 1 free: 15989 MB
> node distances:
> node   0   1
>    0:  10  10
>    1:  10  10
> 
> As shown above, a new node has been faked. As for CPUs, the behavior
> of the x86 NUMA emulation is kept. Giving each node 4 cores might be
> better (not sure; to be done next if so).
> 
> Why do this
> ===========
> 
> There seem to be the following reasons:
>    (1) On x86 hosts, NUMA emulation can fake a multi-node environment
>        to test or verify performance behavior, but on arm64 the only
>        way to do this is to modify the ACPI table, which is rather
>        troublesome.
>    (2) Reduce contention on some locks. Here is an example we found:
>        will-it-scale/tlb_flush1_processes -t 96 -s 10 shows an obvious
>        hotspot on lruvec->lock when run in a single-node environment.
>        Moreover, performance improves greatly when the test runs on a
>        system with two or more nodes. The data is shown below (higher
>        is better):
> 
>        ---------------------------------------------------------------------
>        threads/process |   1     |     12   |     24   |   48     |   96
>        ---------------------------------------------------------------------
>        one node        | 14 1122 | 110 5372 | 111 2615 | 79 7084  | 72 4516
>        ---------------------------------------------------------------------
>        numa=fake=2     | 14 1168 | 144 4848 | 215 9070 | 157 0412 | 142 3968
>        ---------------------------------------------------------------------
>                        | For concurrency 12, no lruvec->lock hotspot. For 24,
>        hotspot         | one node has a 24% hotspot on lruvec->lock, but the
>                        | two-node environment has none.
>        ---------------------------------------------------------------------
> 
> As for risks (e.g. NUMA balancing...), they need to be discussed here.
> 
> Lastly, this is just a draft; I can improve it next if it's acceptable.

I'm not commenting on the utility/relevance of the patch-set, but I tried
it on an arm64 system with the 'numa=fake=2' parameter and could not
see 2 nodes being created under:
   /sys/devices/system/node/
Indeed it seems that even though numa_emulation() is moved to a generic
mm/numa.c file, the function is only called from:
   arch/x86/mm/numa.c:numa_init()
(or maybe I'm misinterpreting the intent of the patches).
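
For anyone who wants to repeat the sysfs check above programmatically,
here is a minimal userspace sketch in C. It only counts directory entries
named node<N> under a given path. The helper names is_node_entry() and
count_numa_nodes() are hypothetical (they are not part of the patch-set or
of any kernel API), and the sketch assumes the usual sysfs layout where
each online node appears as node0, node1, ...

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if name looks like "node<N>", else 0. */
int is_node_entry(const char *name)
{
	const char *p;

	assert(name != NULL);

	/* Must start with "node" and have at least one character after it. */
	if (strncmp(name, "node", 4) != 0 || name[4] == '\0')
		return 0;

	/* Everything after "node" must be a decimal digit. */
	for (p = name + 4; *p != '\0'; p++)
		if (!isdigit((unsigned char)*p))
			return 0;

	return 1;
}

/*
 * Hypothetical helper: count node<N> entries under dir.
 * Returns -1 if dir cannot be opened.
 */
int count_numa_nodes(const char *dir)
{
	DIR *d = opendir(dir);
	struct dirent *ent;
	int count = 0;

	if (d == NULL)
		return -1;

	while ((ent = readdir(d)) != NULL)
		count += is_node_entry(ent->d_name);

	closedir(d);
	return count;
}
```

Calling count_numa_nodes("/sys/devices/system/node") on a single-node
machine booted with numa=fake=2 would be expected to return 2 if the
emulation took effect, and 1 otherwise. Entries such as has_cpu, online or
possible in that directory are skipped by is_node_entry().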

Also I had the following errors when building (still for arm64):
mm/numa.c:862:8: error: implicit declaration of function 'early_cpu_to_node' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
         nid = early_cpu_to_node(cpu);
               ^
mm/numa.c:862:8: note: did you mean 'early_map_cpu_to_node'?
./include/asm-generic/numa.h:37:13: note: 'early_map_cpu_to_node' declared here
void __init early_map_cpu_to_node(unsigned int cpu, int nid);
             ^
mm/numa.c:874:3: error: implicit declaration of function 'debug_cpumask_set_cpu' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
                 debug_cpumask_set_cpu(cpu, nid, enable);
                 ^
mm/numa.c:874:3: note: did you mean '__cpumask_set_cpu'?
./include/linux/cpumask.h:474:29: note: '__cpumask_set_cpu' declared here
static __always_inline void __cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
                             ^
2 errors generated.

Regards,
Pierre

> 
> Thanks!
> 
> Rongwei Wang (5):
>    mm/numa: move numa emulation APIs into generic files
>    mm: percpu: fix variable type of cpu
>    arch_numa: remove __init in early_cpu_to_node()
>    mm/numa: support CONFIG_NUMA_EMU for arm64
>    mm/numa: migrate leftover numa emulation into mm/numa.c
> 
>   arch/x86/Kconfig                          |   8 -
>   arch/x86/include/asm/numa.h               |   3 -
>   arch/x86/mm/Makefile                      |   1 -
>   arch/x86/mm/numa.c                        | 216 +-------------
>   arch/x86/mm/numa_internal.h               |  14 +-
>   drivers/base/arch_numa.c                  |   7 +-
>   include/asm-generic/numa.h                |  33 +++
>   include/linux/percpu.h                    |   2 +-
>   mm/Kconfig                                |   8 +
>   mm/Makefile                               |   1 +
>   arch/x86/mm/numa_emulation.c => mm/numa.c | 333 +++++++++++++++++++++-
>   11 files changed, 373 insertions(+), 253 deletions(-)
>   rename arch/x86/mm/numa_emulation.c => mm/numa.c (63%)
> 


