[PATCH 0/1] irqbalance: Fix min_load to pick actual min load across all objects
Vallish Vaidyeshwara
vallish at amazon.com
Tue Nov 14 11:20:33 PST 2017
We recently encountered a corner-case bug in irqbalance where it was
not balancing interrupts across two CPUs.
The code snippet in question is:
static void gather_load_stats(struct topo_obj *obj, void *data)
{
	struct load_balance_info *info = data;

	if (info->min_load == 0 || obj->load < info->min_load)
		info->min_load = obj->load;
	info->total_load += obj->load;
	info->load_sources += 1;
}
The bug we encountered involved two CPUs, with the load on the first
CPU being 0.
E.g.: obj1->load = 0 and obj2->load = 5000.
Iteration 1: obj1 is passed in, along with info, whose fields are all
initialized to 0.
	if (info->min_load == 0 || obj->load < info->min_load)
		info->min_load = obj->load;
Because info->min_load is 0, info->min_load is set to obj->load, which
is also 0 in this case.
Iteration 2: obj2 is passed in, with info holding the values set by the
previous iteration:
info->min_load = 0, info->total_load = 0, info->load_sources = 1
	if (info->min_load == 0 || obj->load < info->min_load)
		info->min_load = obj->load;
Because info->min_load is still 0, the first half of the condition in
gather_load_stats() is true again, and info->min_load is overwritten
with obj2->load, which is 5000. This is not the minimum load; the actual
minimum is 0. Because of this bug, interrupts do not migrate, since the
function that migrates interrupts checks this value.
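The two iterations above can be reproduced outside irqbalance with a
small sketch. The structures below are cut-down stand-ins that contain
only the fields used here, and min_load_of() is a hypothetical helper,
not an irqbalance function. gather_load_stats_new() shows one possible
fix, assuming load_sources == 0 is used to detect the first object
instead of min_load == 0; the actual change is in patch 1/1 and may
differ.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the irqbalance structures (assumption:
 * only the fields used by gather_load_stats() are modelled). */
struct topo_obj {
	unsigned long long load;
};

struct load_balance_info {
	unsigned long long total_load;
	unsigned long long min_load;
	unsigned int load_sources;
};

/* Logic as quoted above: min_load == 0 doubles as "not set yet". */
void gather_load_stats_old(struct topo_obj *obj, void *data)
{
	struct load_balance_info *info = data;

	if (info->min_load == 0 || obj->load < info->min_load)
		info->min_load = obj->load;
	info->total_load += obj->load;
	info->load_sources += 1;
}

/* One possible fix (assumption): treat "no objects seen so far"
 * as the unset state, so a genuine minimum of 0 is kept. */
void gather_load_stats_new(struct topo_obj *obj, void *data)
{
	struct load_balance_info *info = data;

	if (info->load_sources == 0 || obj->load < info->min_load)
		info->min_load = obj->load;
	info->total_load += obj->load;
	info->load_sources += 1;
}

/* Hypothetical helper: run a gather function over an array of
 * objects, starting from a zeroed info, and return min_load. */
unsigned long long min_load_of(struct topo_obj *objs, size_t n,
			       void (*gather)(struct topo_obj *, void *))
{
	struct load_balance_info info = { 0 };
	size_t i;

	assert(gather != NULL);
	for (i = 0; i < n; i++)
		gather(&objs[i], &info);
	return info.min_load;
}
```

With objs = { { 0 }, { 5000 } }, min_load_of() returns 5000 for the old
function and 0 for the sketched fix, matching the walkthrough above.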
Test results:
-------------
i3.large system - single socket, single core, hyperthreading enabled
With fix:
[ec2-user@ip-10-0-38-254 tmp]$ cat /proc/interrupts | grep "\<CPU0\>\|\<60\>\|\<61\>\|\<66\>\|\<67\>\|\<69\>\|\<70\>"
           CPU0       CPU1
 60:    1593819     292297  xen-pirq-msi-x  nvme0q0, nvme0q1
 61:     456537     831877  xen-pirq-msi-x  nvme0q2
 66:       1498       1644  xen-pirq-msi-x  eth1-Tx-Rx-0
 67:       1535       1034  xen-pirq-msi-x  eth1-Tx-Rx-1
 69:    3579919    8973629  xen-pirq-msi-x  eth2-Tx-Rx-0
 70:    9265764    3747096  xen-pirq-msi-x  eth2-Tx-Rx-1
Without fix:
[ec2-user@ip-10-0-10-75 ~]$ cat /proc/interrupts | grep "\<CPU0\>\|\<60\>\|\<61\>\|\<66\>\|\<67\>\|\<69\>\|\<70\>"
           CPU0       CPU1
 60:       4014    1088758  xen-pirq-msi-x  nvme0q0, nvme0q1
 61:       2046    2218741  xen-pirq-msi-x  nvme0q2
 66:         11       2978  xen-pirq-msi-x  eth1-Tx-Rx-0
 67:          8         43  xen-pirq-msi-x  eth1-Tx-Rx-1
 69:         43   13773325  xen-pirq-msi-x  eth2-Tx-Rx-0
 70:          4   13102621  xen-pirq-msi-x  eth2-Tx-Rx-1
Vallish Vaidyeshwara (1):
irqbalance: Fix min_load to pick actual min load across all objects
irqlist.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--
2.7.5