[PATCH v8 2/2] sched/fair: Scan cluster before scanning LLC in wake-up path

Chen Yu yu.c.chen at intel.com
Sun Jun 11 22:22:16 PDT 2023


On 2023-06-12 at 10:31:39 +0530, Gautham R. Shenoy wrote:
> Hello Yicong,
> 
> 
> On Tue, May 30, 2023 at 03:02:53PM +0800, Yicong Yang wrote:
> > From: Barry Song <song.bao.hua at hisilicon.com>
> [..snip..]
> 
> > @@ -7103,7 +7127,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >  	bool has_idle_core = false;
> >  	struct sched_domain *sd;
> >  	unsigned long task_util, util_min, util_max;
> > -	int i, recent_used_cpu;
> > +	int i, recent_used_cpu, prev_aff = -1;
> >  
> >  	/*
> >  	 * On asymmetric system, update task utilization because we will check
> > @@ -7130,8 +7154,11 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >  	 */
> >  	if (prev != target && cpus_share_cache(prev, target) &&
> >  	    (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> > -	    asym_fits_cpu(task_util, util_min, util_max, prev))
> > -		return prev;
> > +	    asym_fits_cpu(task_util, util_min, util_max, prev)) {
> > +		if (cpus_share_lowest_cache(prev, target))
> 
> For platforms without the cluster domain, the cpus_share_lowest_cache
> check is a repetition of the cpus_share_cache(prev, target) check. Can
> we avoid this using a static branch check for cluster ?
> 
>
Sounds good. 
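To illustrate the idea, here is a minimal userspace sketch, not kernel code.
The names cluster_active and prev_is_immediate_pick() and the toy topology
are hypothetical, made up for this example. In the kernel the flag would
presumably be a static key (DEFINE_STATIC_KEY_FALSE()) tested with
static_branch_unlikely(), so that platforms without a cluster domain skip
the second comparison.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/*
 * Hypothetical stand-in for a static key. Kernel code would declare it
 * with DEFINE_STATIC_KEY_FALSE() and test it with static_branch_unlikely().
 */
bool cluster_active;

/* Toy topology (assumption): one LLC for all CPUs, clusters of 4 CPUs. */
int llc_id(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return 0;
}

int cluster_id(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu / 4;
}

bool cpus_share_cache(int a, int b)
{
	return llc_id(a) == llc_id(b);
}

bool cpus_share_lowest_cache(int a, int b)
{
	return cluster_id(a) == cluster_id(b);
}

/*
 * Models only the cache-sharing part of the prev check: true means prev
 * may be returned right away, false means it is at best a fallback.
 */
bool prev_is_immediate_pick(int prev, int target)
{
	if (!cpus_share_cache(prev, target))
		return false;
	/* No cluster domain: the second check would repeat the first. */
	if (!cluster_active)
		return true;
	return cpus_share_lowest_cache(prev, target);
}
```

With the flag clear, any prev sharing the LLC with target is taken
immediately, as before the patch. With the flag set, only a prev in the
target's cluster is.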
> > +			return prev;
> > +		prev_aff = prev;
> > +	}
> >  
> >  	/*
> >  	 * Allow a per-cpu kthread to stack with the wakee if the
> > @@ -7158,7 +7185,10 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >  	    (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
> >  	    cpumask_test_cpu(p->recent_used_cpu, p->cpus_ptr) &&
> >  	    asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) {
> > -		return recent_used_cpu;
> > +		if (cpus_share_lowest_cache(recent_used_cpu, target))
> 
> Same here.
> 
> > +			return recent_used_cpu;
> > +	} else {
> > +		recent_used_cpu = -1;
> >  	}
> >  
> >  	/*
> > @@ -7199,6 +7229,17 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >  	if ((unsigned)i < nr_cpumask_bits)
> >  		return i;
> >  
> > +	/*
> > +	 * For cluster machines which have lower sharing cache like L2 or
> > +	 * LLC Tag, we tend to find an idle CPU in the target's cluster
> > +	 * first. But prev_cpu or recent_used_cpu may also be a good candidate,
> > +	 * use them if possible when no idle CPU found in select_idle_cpu().
> > +	 */
> > +	if ((unsigned int)prev_aff < nr_cpumask_bits)
> > +		return prev_aff;
> 
> Shouldn't we check if prev_aff (and the recent_used_cpu below) is
> still idle ?
> 
>
By the time we reach here, target has been found non-idle and prev_aff has
been found idle by the checks above. There is a race window in which
prev_aff becomes non-idle and target becomes idle after select_idle_cpu()
has run, but that window should be small IMO.
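
For comparison, a minimal userspace model of the fallback at the end of
select_idle_sibling(), with and without the re-check Gautham asks about.
The cpu_idle[] array, pick_fallback() and the recheck parameter are
hypothetical and exist only for this sketch. The kernel would ask
available_idle_cpu() / sched_idle_cpu() instead.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Toy idle state (assumption), standing in for available_idle_cpu(). */
bool cpu_idle[NR_CPUS];

/*
 * Model of the tail of select_idle_sibling(): try prev_aff, then
 * recent_used_cpu, then give up and return target. A value of -1 means
 * "no candidate", which the unsigned compare rejects.
 * With recheck set, a candidate is used only if it is still idle.
 */
int pick_fallback(int prev_aff, int recent_used_cpu, int target, bool recheck)
{
	assert(target >= 0 && target < NR_CPUS);

	if ((unsigned int)prev_aff < NR_CPUS &&
	    (!recheck || cpu_idle[prev_aff]))
		return prev_aff;

	if ((unsigned int)recent_used_cpu < NR_CPUS &&
	    (!recheck || cpu_idle[recent_used_cpu]))
		return recent_used_cpu;

	return target;
}
```

The only difference shows up when a candidate went busy in the meantime:
without the re-check it is still returned, with the re-check the next
candidate (or target) is used at the cost of one more idle test each.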

thanks,
Chenyu 
> > +	if ((unsigned int)recent_used_cpu < nr_cpumask_bits)
> > +		return recent_used_cpu;
> > +
> >  	return target;
> >  }
> >  
> 
> --
> Thanks and Regards
> gautham.
