[PATCHv2 0/2]: kexec: Force kexec to proceed under heavy deadline load

Pingfan Liu piliu at redhat.com
Mon Oct 27 20:09:12 PDT 2025


During discussion of the scheduler deadline bug [1], Pierre Gondois
pointed out a potential issue during kexec: as CPUs are unplugged, the
available DL bandwidth of the root domain gradually decreases. At some
point, the remaining bandwidth is insufficient and the overflow check
fails, causing CPU hot-removal to fail and kexec to hang [2].
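
To make the failure point concrete, here is a rough user-space model of
the admission arithmetic. It is only a sketch, not the kernel's code: it
assumes all CPUs have equal capacity, that the cap is the default
sched_rt_runtime_us / sched_rt_period_us = 950000 / 1000000, and that the
only DL load is the ten tasks from the reproducer in this cover letter.
The helper names (dl_overflow, first_failing_offline) are made up for
illustration.

```python
from fractions import Fraction

# Assumption: default cap, sched_rt_runtime_us / sched_rt_period_us.
DEFAULT_CAP = Fraction(950_000, 1_000_000)


def dl_overflow(online_cpus, total_bw, cap=DEFAULT_CAP):
    """Simplified check: True if total_bw no longer fits on online_cpus."""
    return total_bw > online_cpus * cap


def first_failing_offline(start_cpus, total_bw, cap=DEFAULT_CAP):
    """Offline CPUs one at a time.

    Return the online CPU count at which removing one more CPU would be
    refused, or None if every removal down to one CPU is accepted.
    """
    cpus = start_cpus
    while cpus > 1:
        if dl_overflow(cpus - 1, total_bw, cap):
            return cpus
        cpus -= 1
    return None


# Reproducer: 10 tasks with runtime == period == 1000000 ns,
# so each task asks for one full CPU of bandwidth.
task_bw = Fraction(1_000_000, 1_000_000)
total_bw = 10 * task_bw

stuck_at = first_failing_offline(160, total_bw)
print(stuck_at)
```

Under these assumptions the load fits easily on 160 CPUs, but the removal
that would leave 10 CPUs online is refused, since 10 * 0.95 is less than
the 10 CPUs' worth of bandwidth already admitted.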
    
I reproduced it on a system with 160 CPUs using the following commands:
  seq 10 | xargs -I{} -P10 sh -c 'chrt -d -T 1000000 -P 1000000 0 yes > /dev/null &'
  kexec -e

The system hangs during the kexec process.
 
This series skips the DL bandwidth check while kexec is in progress and
sends SIGSTOP to all userspace DL tasks so that kexec can proceed.

[1]: https://lore.kernel.org/all/20250929133602.32462-1-piliu@redhat.com/
[2]: https://lore.kernel.org/all/3408aca5-e6c9-434a-9950-82e9147fcbba@arm.com/

RFC -> v2:
Instead of migrating the DL tasks, SIGSTOP them.

Pingfan Liu (2):
  sched/deadline: Skip the deadline bandwidth check if kexec_in_progress
  kernel/kexec: Stop all userspace deadline tasks

 kernel/kexec_core.c     | 23 +++++++++++++++++++++++
 kernel/sched/deadline.c |  7 +++++++
 2 files changed, 30 insertions(+)

-- 
2.49.0



