[PATCH RESEND] lib/group_cpus: make group CPU cluster aware

Mon Nov 10 21:31:04 PST 2025

On 11/11/2025 11:25 AM, Ming Lei wrote:
> On Tue, Nov 11, 2025 at 10:06:08AM +0800, Wangyang Guo wrote:
>> As CPU core counts increase, the number of NVMe IRQs may be smaller than
>> the total number of CPUs. This forces multiple CPUs to share the same
>> IRQ. If the IRQ affinity and the CPU’s cluster do not align, a
>> performance penalty can be observed on some platforms.
> 
> Can you add details why/how CPU cluster isn't aligned with IRQ
> affinity? And how performance penalty is caused?

The Intel Xeon E platform packs 4 CPU cores into 1 module (cluster),
and the cores in a module share the L2 cache. Say there are 40 CPUs in
1 NUMA domain and 11 IRQs to dispatch. The existing algorithm maps the
first 7 IRQs to 4 CPUs each and the remaining 4 IRQs to 3 CPUs each.
The last 4 IRQs may then have a cross-cluster issue. For example, the
9th IRQ covers CPU31-33; if it is pinned to CPU32, then CPU31 will
have cross-L2 memory access.

CPU |28 29 30 31|32 33 34 35|36 ...
      -------- -------- --------
IRQ      8        9       10
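
To make the arithmetic concrete, here is a small Python model of the
even split described above. It is only a sketch, not the code in
lib/group_cpus.c: it assumes CPUs are numbered contiguously with 4 per
cluster, and the helper names (even_groups, crosses_cluster) are made up
for illustration.

```python
CPUS = 40          # CPUs in the NUMA domain (from the example above)
IRQS = 11          # IRQs to dispatch
CLUSTER_SIZE = 4   # cores per module sharing one L2


def even_groups(ncpus, ngroups):
    """Split CPUs 0..ncpus-1 into contiguous groups; the first
    (ncpus % ngroups) groups get one extra CPU."""
    base, extra = divmod(ncpus, ngroups)
    groups, start = [], 0
    for i in range(ngroups):
        size = base + (1 if i < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def crosses_cluster(group, cluster_size=CLUSTER_SIZE):
    """True if the CPUs in the group belong to more than one cluster."""
    return len({cpu // cluster_size for cpu in group}) > 1


groups = even_groups(CPUS, IRQS)
for irq, group in enumerate(groups, start=1):
    mark = "  <- spans two clusters" if crosses_cluster(group) else ""
    print(f"IRQ {irq:2d}: CPUs {group[0]:2d}-{group[-1]:2d}{mark}")
```

In this model the 9th IRQ gets CPU31-33 and the 10th gets CPU34-36, so
both straddle a cluster boundary.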

With this patch applied, the first 2 IRQs are each mapped to 2 CPUs and
the remaining 9 IRQs are each mapped to 4 CPUs, which avoids the
cross-cluster memory access.

CPU |00 01 02 03|04 05 06 07|08 09 10 11| ...
      ----- ----- ----------- -----------
IRQ  1      2        3           4
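
A rough Python sketch of a cluster-aware split that reproduces the
layout above. This is not the algorithm from the patch, just one simple
way to get the same result: give every cluster one group, hand any
leftover groups to the first clusters, then split each cluster evenly.
It assumes contiguous CPU numbering, a CPU count that is a multiple of
the cluster size, and at least as many groups as clusters; the function
name is hypothetical.

```python
def cluster_aware_groups(ncpus, ngroups, cluster_size):
    """Split CPUs into ngroups groups without crossing cluster boundaries.

    Sketch only: requires ncpus % cluster_size == 0 and
    (number of clusters) <= ngroups <= ncpus.
    """
    if ncpus % cluster_size != 0:
        raise ValueError("ncpus must be a multiple of cluster_size")
    nclusters = ncpus // cluster_size
    if not nclusters <= ngroups <= ncpus:
        raise ValueError("need nclusters <= ngroups <= ncpus")

    # Every cluster gets `base` groups; the first `extra` clusters get one more.
    base, extra = divmod(ngroups, nclusters)
    groups = []
    for c in range(nclusters):
        parts = base + (1 if c < extra else 0)
        # Split this cluster's CPUs evenly into `parts` contiguous groups.
        size, rem = divmod(cluster_size, parts)
        start = c * cluster_size
        for p in range(parts):
            n = size + (1 if p < rem else 0)
            groups.append(list(range(start, start + n)))
            start += n
    return groups


groups = cluster_aware_groups(40, 11, 4)
for irq, group in enumerate(groups, start=1):
    print(f"IRQ {irq:2d}: CPUs {group[0]:2d}-{group[-1]:2d}")
```

For 40 CPUs, 11 IRQs and 4 CPUs per cluster this gives two 2-CPU groups
followed by nine 4-CPU groups, and no group spans two clusters.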

BR
Wangyang