[RFC PATCH 0/2] sched: Load Balancing using Per-entity-Load-tracking
preeti
preeti at linux.vnet.ibm.com
Wed Oct 17 08:20:50 EDT 2012
Hi Guys,
Can you please have a look at the patchset below? Your review comments
would be very valuable. Thanks in advance.
> This patchset uses the per-entity-load-tracking patchset, which will soon be
> available in the kernel. It is based on the tip/master tree; only the first 8
> of the latest patches of sched:per-entity-load-tracking have been imported
> into the tree, to avoid the complexities of task groups and to hold back the
> optimizations of this patch for now.
>
> This patchset is an attempt to begin the integration of the
> per-entity-load metric for the cfs_rq, henceforth referred to as PJT's
> metric, with the load balancer in a stepwise fashion, and to progress based
> on the consequences.
>
> The following issues have been considered towards this:
> [NOTE: an x% task referred to in the logs and below is calculated over a
> duty cycle of 10ms.]
>
> 1. Consider a scenario where there are two 10% tasks running on a cpu. The
> present code will consider the load on this queue to be 2048, while
> using PJT's metric the load is calculated to be <1000, rarely exceeding this
> limit. Although the tasks are not contributing much to the cpu load, the
> scheduler decides to move them.
>
> But one could argue that 'not moving one of these tasks could throttle
> them. If there was an idle cpu, perhaps we could have moved them'. While the
> power save mode would have been fine with not moving the task, the
> performance mode would prefer not to throttle the tasks. We could strive
> to strike a balance by making this decision tunable with certain parameters.
> This patchset includes such tunables. This issue is addressed in Patch[1/2].
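
To make the numbers in point 1 concrete, here is a rough user-space sketch
(not kernel code) comparing the two views of the queue. It assumes a nice-0
weight of 1024, 1ms accounting periods and a decay factor y with y^32 = 0.5,
as in the per-entity-load-tracking series. The function names are made up for
illustration, and the sketch only counts the time a task actually runs, so the
real metric can read higher when tasks wait on the runqueue.

```c
#define NICE_0_LOAD 1024.0      /* weight of a nice-0 task */
#define DECAY_Y     0.97857206  /* assumed decay: y^32 is about 0.5 */

/* Weight-based load: every runnable task counts with its full weight. */
static double weight_based_load(int nr_tasks)
{
    return nr_tasks * NICE_0_LOAD;
}

/*
 * Tracked load of one task that runs for busy_ms out of every cycle_ms
 * milliseconds, observed over total_ms one-millisecond periods.
 * Both the runnable sum and the period sum decay by y each period.
 */
static double tracked_task_load(int busy_ms, int cycle_ms, int total_ms)
{
    double runnable_sum = 0.0, period_sum = 0.0;
    int t;

    if (total_ms <= 0 || cycle_ms <= 0)
        return 0.0;

    for (t = 0; t < total_ms; t++) {
        int running = (t % cycle_ms) < busy_ms;

        runnable_sum = runnable_sum * DECAY_Y + (running ? 1.0 : 0.0);
        period_sum = period_sum * DECAY_Y + 1.0;
    }
    return NICE_0_LOAD * runnable_sum / period_sum;
}

/* Tracked load of a queue holding nr_tasks identical tasks. */
static double tracked_queue_load(int nr_tasks, int busy_ms, int cycle_ms,
                                 int total_ms)
{
    return nr_tasks * tracked_task_load(busy_ms, cycle_ms, total_ms);
}
```

For two 10% tasks over a 10ms duty cycle, weight_based_load(2) gives 2048,
while tracked_queue_load(2, 1, 10, 1000) stays well below 1000 in this
simplified model.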
>
> 2. We need to be able to do this cautiously, as the scheduler code is very
> complex. This patchset is an attempt to begin the integration of PJT's
> metric with the load balancer in a stepwise fashion, and to progress based
> on the consequences.
> I don't intend to vary the parameters used by the load balancer. Some
> new parameters are, however, introduced to make decisions about including a
> sched group as a candidate for load balancing.
>
> This patchset therefore has two primary aims.
> Patch[1/2]: This patch aims at detecting short running tasks and
> preventing their movement. In update_sg_lb_stats, dismiss a sched group
> as a candidate for load balancing if the load calculated by PJT's metric
> says that the average load on the sched_group <= 1024+(.15*1024).
> This is a tunable, which can be varied after sufficient experiments.
>
> Patch[2/2]: In the current scheduler, a greater load is analogous
> to a larger number of tasks. Therefore, when the busiest group is picked
> from the sched domain in update_sd_lb_stats, only the loads of the
> groups are compared. If we were to use PJT's metric, a
> higher load does not necessarily mean a larger number of tasks. This
> patch addresses this issue.
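
As a rough illustration of the two aims (a user-space sketch, not the code
from the patches): the struct, the function names and the integer rounding of
the threshold are assumptions made for this example, and only the
1024+(.15*1024) cutoff comes from the description above. The last helper only
shows why comparing loads alone can mislead; it does not reproduce how
Patch[2/2] picks the group.

```c
#define NICE_0_LOAD    1024UL
#define SHORT_TASK_PCT 15UL    /* tunable: the .15 factor, as a percentage */

/* Hypothetical per-group statistics for the example. */
struct group_stats {
    unsigned long avg_load;    /* average load according to PJT's metric */
    unsigned int nr_running;   /* number of runnable tasks in the group */
};

/* Cutoff for dismissing a group: 1024 + 15% of 1024, rounded down. */
static unsigned long lb_load_threshold(void)
{
    return NICE_0_LOAD + (NICE_0_LOAD * SHORT_TASK_PCT) / 100;
}

/* Patch[1/2] idea: a group with avg_load <= threshold is not a candidate. */
static int group_is_lb_candidate(const struct group_stats *sg)
{
    return sg->avg_load > lb_load_threshold();
}

/* Load-only comparison: returns 1 if group a looks busier than group b. */
static int busier_by_load(const struct group_stats *a,
                          const struct group_stats *b)
{
    return a->avg_load > b->avg_load;
}
```

With integer arithmetic the cutoff comes out as 1177, so a group with an
average load of 1000 is dismissed and one with 1200 is kept. For the second
aim, take illustrative values: a group with one always-running task
(avg_load 1024, 1 task) and a group with four short-running tasks
(avg_load 400, 4 tasks). busier_by_load() prefers the first group even
though it has fewer tasks.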
>
> 3. The next step towards integration should be to use PJT's metric for
> comparing the load of the busy sched group with that of the sched
> group which has to pull the tasks, which happens in find_busiest_group.
> ---
>
> Preeti U Murthy (2):
> sched:Prevent movement of short running tasks during load balancing
> sched:Pick the apt busy sched group during load balancing
>
>
> kernel/sched/fair.c | 38 +++++++++++++++++++++++++++++++++++---
> 1 file changed, 35 insertions(+), 3 deletions(-)
>
> --
The links to the patches:
PATCH[1/2] https://lkml.org/lkml/2012/10/12/13
PATCH[2/2] https://lkml.org/lkml/2012/10/12/11
Regards
Preeti U Murthy