BUG: HANG_DETECT waiting for migration_cpu_stop() complete
Waiman Long
longman at redhat.com
Tue Sep 6 14:02:20 PDT 2022
On 9/6/22 16:50, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 04:40:03PM -0400, Waiman Long wrote:
>
> I've not followed the earlier stuff due to being unreadable; just
> reacting to this..
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 838623b68031..5d9ea1553ec0 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2794,9 +2794,9 @@ static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
>> if (cpumask_equal(&p->cpus_mask, new_mask))
>> goto out;
>>
>> - if (WARN_ON_ONCE(p == current &&
>> - is_migration_disabled(p) &&
>> - !cpumask_test_cpu(task_cpu(p), new_mask))) {
>> + if (is_migration_disabled(p) &&
>> + !cpumask_test_cpu(task_cpu(p), new_mask)) {
>> + WARN_ON_ONCE(p == current);
>> ret = -EBUSY;
>> goto out;
>> }
>> @@ -2818,7 +2818,11 @@ static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
>> if (flags & SCA_USER)
>> user_mask = clear_user_cpus_ptr(p);
>>
>> - ret = affine_move_task(rq, p, rf, dest_cpu, flags);
>> + if (!is_migration_disabled(p) || (flags & SCA_MIGRATE_ENABLE)) {
>> + ret = affine_move_task(rq, p, rf, dest_cpu, flags);
>> + } else {
>> + task_rq_unlock(rq, p, rf);
>> + }
> This cannot be right. There might be previous set_cpus_allowed_ptr()
> callers that are blocked and waiting for the task to land on a valid
> CPU.
You are probably right. I haven't fully understood all the migration
disable code yet. However, if migration is disabled, there are some
corner cases we need to handle properly.
Cheers,
Longman