[PATCH v5 0/6] In kernel handling of CPU hotplug events for crash kernel
Sourabh Jain
sourabhjain at linux.ibm.com
Sun Nov 20 15:25:02 PST 2022
This patch series implements, on PowerPC, the crash hotplug handler introduced
by the https://lkml.org/lkml/2022/10/31/854 patch series.
The Problem:
============
After hotplug/DLPAR events, the capture kernel holds stale information about
the system. Collecting a dump with a stale capture kernel might result in a
dump capture failure or an inaccurate dump.
Existing solution:
==================
The existing solution keeps the capture kernel up-to-date by monitoring
hotplug events via a udev rule and triggering a full capture kernel reload for
every hotplug event.
Shortcomings:
-------------
- Leaves a window where a kernel crash might not lead to a successful dump
collection.
- Reloading all kexec components for each hotplug is inefficient.
- udev rules are prone to races if hotplug events are frequent.
More details about the issues with the existing solution are posted here:
- https://lkml.org/lkml/2020/12/14/532
- https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-February/240254.html
Proposed Solution:
==================
Instead of reloading all kexec segments on every hotplug event, this patch
series updates only the relevant kexec segment. Once the kexec segments are
loaded into the kernel reserved area, an arch-specific hotplug handler updates
the relevant kexec segment based on the hotplug event type.
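The idea of updating a single segment per event, rather than reloading
everything, can be illustrated with a small userspace model. This is only a
sketch and not the code from the patches: every type and function name below
(hp_event, seg_kind, image, handle_hotplug, full_reload) is hypothetical, and
the event-to-segment mapping is an assumption. The changelog only states that
the FDT segment is tracked for CPU hotplug; pairing memory events with the
elfcorehdr segment is assumed here for illustration.

```c
#include <assert.h>

/* All names in this sketch are hypothetical; none are kernel APIs. */
enum hp_event { HP_ADD_CPU, HP_REMOVE_CPU, HP_ADD_MEMORY, HP_REMOVE_MEMORY };

enum seg_kind { SEG_KERNEL, SEG_INITRD, SEG_FDT, SEG_ELFCOREHDR, SEG_COUNT };

struct segment {
	enum seg_kind kind;
	unsigned int generation;	/* bumped each time the segment is rewritten */
};

struct image {
	struct segment seg[SEG_COUNT];
};

void image_init(struct image *img)
{
	for (int i = 0; i < SEG_COUNT; i++) {
		img->seg[i].kind = (enum seg_kind)i;
		img->seg[i].generation = 0;
	}
}

/*
 * Pick the one segment that goes stale for a given event.
 * Assumption: CPU events touch the FDT, memory events the elfcorehdr.
 */
enum seg_kind segment_for_event(enum hp_event ev)
{
	switch (ev) {
	case HP_ADD_CPU:
	case HP_REMOVE_CPU:
		return SEG_FDT;
	case HP_ADD_MEMORY:
	case HP_REMOVE_MEMORY:
		return SEG_ELFCOREHDR;
	}
	assert(0 && "unknown hotplug event");
	return SEG_COUNT;
}

/* Proposed approach in miniature: rewrite only the affected segment. */
void handle_hotplug(struct image *img, enum hp_event ev)
{
	enum seg_kind k = segment_for_event(ev);

	if (k < SEG_COUNT)
		img->seg[k].generation++;
}

/* Existing approach in miniature: every segment is reloaded. */
void full_reload(struct image *img)
{
	for (int i = 0; i < SEG_COUNT; i++)
		img->seg[i].generation++;
}
```

In this model, calling handle_hotplug() with HP_ADD_CPU bumps only the FDT
generation counter, whereas full_reload() bumps all four, which is the cost
the series aims to avoid on each event.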
---
Changelog:
v5 -> v6:
- Added crash memory hotplug support
v4 -> v5:
- Replace CONFIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
- Move fdt segment identification for kexec_load case to load path
instead of crash hotplug handler
- Keep new attribute defined under kimage_arch to track FDT segment
under CONFIG_HOTPLUG_CPU config.
v3 -> v4:
- Update the logic to find the additional space needed for hot-added CPUs post
kexec load. Refer to the "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash hotplug
support for kexec_file_load" patch for details about the change.
- Fix a couple of typos.
- Replace pr_err with pr_info_once to warn the user about memory hotplug
support.
- In the crash hotplug handler, exit the for loop once the FDT segment is found.
v2 -> v3:
- Move fdt_index and fdt_index_valid variables to kimage_arch struct.
- Rebase patches on top of https://lkml.org/lkml/2022/3/3/674 [v5]
- Fix warnings reported by the checkpatch script
v1 -> v2:
- Use generic hotplug handler introduced by https://lkml.org/lkml/2022/2/9/1406, a
significant change from v1.
---
Sourabh Jain (6):
  powerpc/kexec: turn some static helper functions public
  powerpc/crash: update kimage_arch struct
  crash: add phdr for possible CPUs in elfcorehdr
  powerpc/crash: add crash CPU hotplug support
  crash: forward memory_notify args to arch crash hotplug handler
  powerpc/kexec: add crash memory hotplug support

 arch/powerpc/include/asm/kexec.h        |  19 ++
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/core_64.c            | 293 ++++++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c             |  72 +++++-
 arch/powerpc/kexec/file_load_64.c       | 179 ++-------------
 arch/powerpc/kexec/ranges.c             |  60 +++++
 arch/x86/include/asm/kexec.h            |   2 +-
 arch/x86/kernel/crash.c                 |   3 +-
 include/linux/kexec.h                   |   3 +-
 kernel/crash_core.c                     |  25 +-
 10 files changed, 477 insertions(+), 180 deletions(-)
--
2.38.1