[PATCH v8 2/2] sched/fair: Scan cluster before scanning LLC in wake-up path

Gautham R. Shenoy gautham.shenoy at amd.com
Sun Jun 11 22:01:39 PDT 2023


Hello Yicong,


On Tue, May 30, 2023 at 03:02:53PM +0800, Yicong Yang wrote:
> From: Barry Song <song.bao.hua at hisilicon.com>
[..snip..]

> @@ -7103,7 +7127,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>  	bool has_idle_core = false;
>  	struct sched_domain *sd;
>  	unsigned long task_util, util_min, util_max;
> -	int i, recent_used_cpu;
> +	int i, recent_used_cpu, prev_aff = -1;
>  
>  	/*
>  	 * On asymmetric system, update task utilization because we will check
> @@ -7130,8 +7154,11 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>  	 */
>  	if (prev != target && cpus_share_cache(prev, target) &&
>  	    (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> -	    asym_fits_cpu(task_util, util_min, util_max, prev))
> -		return prev;
> +	    asym_fits_cpu(task_util, util_min, util_max, prev)) {
> +		if (cpus_share_lowest_cache(prev, target))

For platforms without a cluster domain, the cpus_share_lowest_cache()
check is a repetition of the cpus_share_cache(prev, target) check that
has already been done just above. Can we avoid this using a static
branch check for cluster? Something along the lines of the sketch below.
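
A rough sketch of what I have in mind, assuming a static key (the name
sched_cluster_active used here is hypothetical) that gets enabled during
domain build only when a cluster level exists below the LLC:

	/* Hypothetical key: enabled only when a cluster domain is present. */
	DEFINE_STATIC_KEY_FALSE(sched_cluster_active);

	if (prev != target && cpus_share_cache(prev, target) &&
	    (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
	    asym_fits_cpu(task_util, util_min, util_max, prev)) {
		/*
		 * Without a cluster domain the LLC is the lowest shared
		 * cache, so skip the redundant cpus_share_lowest_cache()
		 * call and return prev directly.
		 */
		if (!static_branch_unlikely(&sched_cluster_active) ||
		    cpus_share_lowest_cache(prev, target))
			return prev;
		prev_aff = prev;
	}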


> +			return prev;
> +		prev_aff = prev;
> +	}
>  
>  	/*
>  	 * Allow a per-cpu kthread to stack with the wakee if the
> @@ -7158,7 +7185,10 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>  	    (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
>  	    cpumask_test_cpu(p->recent_used_cpu, p->cpus_ptr) &&
>  	    asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) {
> -		return recent_used_cpu;
> +		if (cpus_share_lowest_cache(recent_used_cpu, target))

Same here.

> +			return recent_used_cpu;
> +	} else {
> +		recent_used_cpu = -1;
>  	}
>  
>  	/*
> @@ -7199,6 +7229,17 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>  	if ((unsigned)i < nr_cpumask_bits)
>  		return i;
>  
> +	/*
> +	 * For cluster machines which have lower sharing cache like L2 or
> +	 * LLC Tag, we tend to find an idle CPU in the target's cluster
> +	 * first. But prev_cpu or recent_used_cpu may also be a good candidate,
> +	 * use them if possible when no idle CPU found in select_idle_cpu().
> +	 */
> +	if ((unsigned int)prev_aff < nr_cpumask_bits)
> +		return prev_aff;

Shouldn't we check if prev_aff (and the recent_used_cpu below) is
still idle? Either of them may have picked up work by the time
select_idle_cpu() has finished scanning. Something like the sketch
below, perhaps.
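
A sketch of the idle re-check I have in mind, mirroring the conditions
used earlier in select_idle_sibling():

	if ((unsigned int)prev_aff < nr_cpumask_bits &&
	    (available_idle_cpu(prev_aff) || sched_idle_cpu(prev_aff)))
		return prev_aff;

	if ((unsigned int)recent_used_cpu < nr_cpumask_bits &&
	    (available_idle_cpu(recent_used_cpu) ||
	     sched_idle_cpu(recent_used_cpu)))
		return recent_used_cpu;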


> +	if ((unsigned int)recent_used_cpu < nr_cpumask_bits)
> +		return recent_used_cpu;
> +
>  	return target;
>  }
>  

--
Thanks and Regards
gautham.


