[PATCHv2 0/2]: kexec: Force kexec to proceed under heavy deadline load
Pingfan Liu
piliu at redhat.com
Mon Oct 27 20:09:12 PDT 2025
During discussion of the scheduler deadline bug [1], Pierre Gondois
pointed out a potential issue during kexec: as CPUs are unplugged, the
available DL bandwidth of the root domain gradually decreases. At some
point, the remaining bandwidth is insufficient and the overflow check
fails, causing CPU hot-removal to fail and kexec to hang [2].
I reproduced it on a system with 160 CPUs using the following commands:

    seq 10 | xargs -I{} -P10 sh -c 'chrt -d -T 1000000 -P 1000000 0 yes > /dev/null &'
    kexec -e

The system hangs during the kexec process.
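
For intuition, the arithmetic behind the hang can be sketched as a small
userspace model. This is not the kernel code: it assumes CPUs of equal
capacity, the default 95% per-CPU limit (sched_rt_runtime_us = 950000,
sched_rt_period_us = 1000000), and ignores fair-server reservations and
other details. The helper names and the fixed-point shift are
illustrative choices for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fixed-point shift for utilization values (an assumption of this model). */
#define BW_SHIFT 20

/* Convert runtime/period (same time unit) into a fixed-point utilization. */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
    return (runtime << BW_SHIFT) / period;
}

/* True if total_bw no longer fits on `cpus` CPUs under the per-CPU limit. */
static bool dl_overflow(uint64_t per_cpu_bw, unsigned int cpus,
                        uint64_t total_bw)
{
    return per_cpu_bw * cpus < total_bw;
}

/*
 * Offline CPUs one at a time, starting from `cpus`. Return the number of
 * CPUs still online when the next removal would be refused, or 0 if the
 * model can go all the way down to a single CPU.
 */
static unsigned int first_refused_removal(uint64_t per_cpu_bw,
                                          unsigned int cpus,
                                          uint64_t total_bw)
{
    while (cpus > 1) {
        if (dl_overflow(per_cpu_bw, cpus - 1, total_bw))
            return cpus;
        cpus--;
    }
    return 0;
}
```

With the numbers from the reproduction (ten tasks whose runtime equals
their period, 160 CPUs) and the assumed 95% limit,
first_refused_removal(to_ratio(1000000, 950000), 160,
10 * to_ratio(1000000, 1000000)) returns 11: the model refuses the step
from 11 to 10 CPUs, well before kexec gets down to a single CPU.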
This series skips the DL bandwidth check and sends SIGSTOP to all
userspace DL tasks so that the kexec process can proceed.
[1]: https://lore.kernel.org/all/20250929133602.32462-1-piliu@redhat.com/
[2]: https://lore.kernel.org/all/3408aca5-e6c9-434a-9950-82e9147fcbba@arm.com/
RFC -> v2:
  - Instead of migrating the DL tasks, SIGSTOP them.

Pingfan Liu (2):
  sched/deadline: Skip the deadline bandwidth check if kexec_in_progress
  kernel/kexec: Stop all userspace deadline tasks

 kernel/kexec_core.c     | 23 +++++++++++++++++++++++
 kernel/sched/deadline.c |  7 +++++++
 2 files changed, 30 insertions(+)

--
2.49.0