[PATCH 0/1] irqbalance: Fix min_load to pick actual min load across all objects
Neil Horman
nhorman at tuxdriver.com
Tue Nov 14 12:20:34 PST 2017
On Tue, Nov 14, 2017 at 07:20:33PM +0000, Vallish Vaidyeshwara wrote:
> We recently encountered a corner case bug in irqbalance where irqbalance
> was not balancing interrupts across 2 CPU's.
>
We've moved to github:
https://github.com/Irqbalance/irqbalance
If you wouldn't mind opening a pull request or an issue for this, we can happily
take a look at it there
Best
Neil
> The code snippet in question is:
> static void gather_load_stats(struct topo_obj *obj, void *data)
> {
> struct load_balance_info *info = data;
>
> if (info->min_load == 0 || obj->load < info->min_load)
> info->min_load = obj->load;
> info->total_load += obj->load;
> info->load_sources += 1;
> }
>
> The bug we encountered had 2 CPU's with load for the first CPU being 0.
> Eg: obj1->load = 0 and obj2->load = 5000.
>
> Iteration 1: obj1 is passed and info initialized to 0 is passed in
> as param.
> if (info->min_load == 0 || obj->load < info->min_load)
> info->min_load = obj->load;
> Because info->min_load is 0, info->min_load is set to obj->load which
> is also 0 in this case.
>
> Iteration 2: obj2 is passed with info having values set from previous
> iteration:
> info->min_load = 0, info->total_load = 0, info->load_sources = 1
>
> if (info->min_load == 0 || obj->load < info->min_load)
> info->min_load = obj->load;
> Because of the logic used in gather_load_stats() as shown above,
> info->min_load gets set to obj2->load which is 5000. This is not the
> minimum load. Because of this bug, interrupts do not migrate as this
> value is checked in the function which migrates the interrupts.
>
> Test results:
> -------------
> i3.large system - single socket, single core, hyperthreading enabled
>
> With fix:
> [ec2-user at ip-10-0-38-254 tmp]$ cat /proc/interrupts | grep "\<CPU0\>\|\<60\>\|\<61\>\|\<66\>\|\<67\>\|\<69\>\|\<70\>"
> CPU0 CPU1
> 60: 1593819 292297 xen-pirq-msi-x nvme0q0, nvme0q1
> 61: 456537 831877 xen-pirq-msi-x nvme0q2
> 66: 1498 1644 xen-pirq-msi-x eth1-Tx-Rx-0
> 67: 1535 1034 xen-pirq-msi-x eth1-Tx-Rx-1
> 69: 3579919 8973629 xen-pirq-msi-x eth2-Tx-Rx-0
> 70: 9265764 3747096 xen-pirq-msi-x eth2-Tx-Rx-1
>
> Without fix:
> [ec2-user at ip-10-0-10-75 ~]$ cat /proc/interrupts | grep "\<CPU0\>\|\<60\>\|\<61\>\|\<66\>\|\<67\>\|\<69\>\|\<70\>"
> CPU0 CPU1
> 60: 4014 1088758 xen-pirq-msi-x nvme0q0, nvme0q1
> 61: 2046 2218741 xen-pirq-msi-x nvme0q2
> 66: 11 2978 xen-pirq-msi-x eth1-Tx-Rx-0
> 67: 8 43 xen-pirq-msi-x eth1-Tx-Rx-1
> 69: 43 13773325 xen-pirq-msi-x eth2-Tx-Rx-0
> 70: 4 13102621 xen-pirq-msi-x eth2-Tx-Rx-1
>
> Vallish Vaidyeshwara (1):
> irqbalance: Fix min_load to pick actual min load across all objects
>
> irqlist.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --
> 2.7.5
>
>
> _______________________________________________
> irqbalance mailing list
> irqbalance at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/irqbalance
>
More information about the irqbalance
mailing list