[RESEND PATCH v5 2/2] sched/fair: Scan cluster before scanning LLC in wake-up path

Yicong Yang yangyicong at huawei.com
Thu Jul 21 05:42:12 PDT 2022


On 2022/7/21 18:33, Peter Zijlstra wrote:
> On Thu, Jul 21, 2022 at 09:38:04PM +1200, Barry Song wrote:
>> On Wed, Jul 20, 2022 at 11:15 PM Peter Zijlstra <peterz at infradead.org> wrote:
>>>
>>> On Wed, Jul 20, 2022 at 04:11:50PM +0800, Yicong Yang wrote:
>>>> +     /* TODO: Support SMT system with cluster topology */
>>>> +     if (!sched_smt_active() && sd) {
>>>> +             for_each_cpu_and(cpu, cpus, sched_domain_span(sd)) {
>>>
>>> So that's no SMT and no wrap iteration..
>>>
>>> Does something like this work?
>>>
>>> ---
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -6437,6 +6437,30 @@ static int select_idle_cpu(struct task_s
>>>                 }
>>>         }
>>>
>>> +       if (IS_ENABLED(CONFIG_SCHED_CLUSTER) &&
>>> +           static_branch_unlikely(&sched_cluster_active)) {
>>> +               struct sched_domain *sdc = rcu_dereference(per_cpu(sd_cluster, target));
>>> +               if (sdc) {
>>> +                       for_each_cpu_wrap(cpu, sched_domain_span(sdc), target + 1) {
>>> +                               if (!cpumask_test_cpu(cpu, cpus))
>>> +                                       continue;
>>> +
>>> +                               if (has_idle_core) {
>>> +                                       i = select_idle_core(p, cpu, cpus, &idle_cpu);
>>> +                                       if ((unsigned int)i < nr_cpumask_bits)
>>> +                                               return i;
>>> +                               } else {
>>> +                                       if (--nr <= 0)
>>> +                                               return -1;
>>> +                                       idle_cpu = __select_idle_cpu(cpu, p);
>>> +                                       if ((unsigned int)idle_cpu < nr_cpumask_bits)
>>> +                                               break;
>>
>> I guess this should be "return idle_cpu" rather than "break", as "break"
>> will make us go on to scan other CPUs outside the cluster even after we
>> have found an idle_cpu within the cluster.
>>

That would explain why the performance regresses when underloaded: with
"break" we fall through to the full LLC scan even after an idle CPU has been
found in the cluster, so lightly loaded wakeups pay the extra scan cost for
nothing.

>> Yicong,
>> Please test Peter's code with the above change.
> 
> Indeed. Sorry for that.
> 

The performance is still positive based on the tip/sched/core this patch was
developed on, commit 70fb5ccf2ebb ("sched/fair: Introduce SIS_UTIL to search
idle CPU based on sum of util_avg").

On NUMA node 0:
                           tip/core                 patched
Hmean     1        345.89 (   0.00%)      398.43 *  15.19%*
Hmean     2        697.77 (   0.00%)      794.40 *  13.85%*
Hmean     4       1392.51 (   0.00%)     1577.60 *  13.29%*
Hmean     8       2800.61 (   0.00%)     3118.38 *  11.35%*
Hmean     16      5514.27 (   0.00%)     6124.51 *  11.07%*
Hmean     32     10869.81 (   0.00%)    10690.97 *  -1.65%*
Hmean     64      8315.22 (   0.00%)     8520.73 *   2.47%*
Hmean     128     6324.47 (   0.00%)     7253.65 *  14.69%*

On NUMA nodes 0-1:
                           tip/core                 patched
Hmean     1        348.68 (   0.00%)      397.74 *  14.07%*
Hmean     2        693.57 (   0.00%)      795.54 *  14.70%*
Hmean     4       1369.26 (   0.00%)     1548.72 *  13.11%*
Hmean     8       2772.99 (   0.00%)     3055.54 *  10.19%*
Hmean     16      4825.83 (   0.00%)     5936.64 *  23.02%*
Hmean     32     10250.32 (   0.00%)    11780.59 *  14.93%*
Hmean     64     16309.51 (   0.00%)    19864.38 *  21.80%*
Hmean     128    13022.32 (   0.00%)    16365.43 *  25.67%*
Hmean     256    11335.79 (   0.00%)    13991.33 *  23.43%*

Hi Peter,

Do you want me to respin a v6 based on your change?

Thanks,
Yicong


