[PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id
Shijie Huang
shijie at amperemail.onmicrosoft.com
Thu Jan 18 22:46:16 PST 2024
在 2024/1/19 12:42, Yury Norov 写道:
> This adds another level of indirection, I think. Currently cpu_to_node
> is a simple inliner. After the patch it would be a real function with
> all the associate overhead. Can you share a bloat-o-meter output here?
#./scripts/bloat-o-meter vmlinux vmlinux.new
add/remove: 6/1 grow/shrink: 61/51 up/down: 1168/-588 (580)
Function old new delta
numa_update_cpu 148 244 +96
...................................................................................................................................(to many to skip)
Total: Before=32990130, After=32990710, chg +0.00%
>
> Regardless, I don't think that the approach is correct. As per your
> description, some initialization functions erroneously call
> cpu_to_node() instead of early_cpu_to_node() which exists specifically
> for that case.
>
> If the above correct, it's clearly a caller problem, and the fix is to
> simply switch all those callers to use early version.
It is easy to change to early_cpu_to_node() for sched_init(),
init_sched_fair_class()
and workqueue_init_early(). These three places call the cpu_to_node() in
the __init function.
But it is a little hard to change the early_trace_init(), since it calls
cpu_to_node in the deep
function stack:
early_trace_init() --> ring_buffer_alloc() -->rb_allocate_cpu_buffer()
For early_trace_init(), we need to change more code.
Anyway, If we think it is not a good idea to change the common code, I
am oaky too.
>
> I would also initialize the numa_node with NUMA_NO_NODE at declaration,
> so that if someone calls cpu_to_node() before the variable is properly
> initialized at runtime, he'll get NO_NODE, which is obviously an error.
Even we set the numa_node with NUMA_NO_NODE, it does not always produce
error.
Please see the alloc_pages_node().
Thanks
Huang Shijie
More information about the linux-arm-kernel
mailing list