[PATCH v5 4/8] crash: generic crash hotplug support infrastructure

Baoquan He bhe at redhat.com
Mon Mar 28 18:10:34 PDT 2022


On 03/28/22 at 11:08am, Eric DeVolder wrote:
> Baoquan, a comment below.
> eric
> 
> On 3/24/22 09:37, Eric DeVolder wrote:
> > 
> > 
> > On 3/24/22 09:33, Baoquan He wrote:
> > > On 03/24/22 at 08:53am, Eric DeVolder wrote:
> > > > Baoquan,
> > > > Thanks, I've offered a minor correction below.
> > > > eric
> > > > 
> > > > On 3/24/22 08:49, Baoquan He wrote:
> > > > > On 03/24/22 at 09:38pm, Baoquan He wrote:
> > > > > > On 03/03/22 at 11:27am, Eric DeVolder wrote:
> > > > > > > This patch introduces a generic crash hot plug/unplug infrastructure
> > > > > > > for CPU and memory changes. Upon CPU and memory changes, a generic
> > > > > > > crash_hotplug_handler() obtains the appropriate lock, does some
> > > > > > > important house keeping and then dispatches the hot plug/unplug event
> > > > > > > to the architecture specific arch_crash_hotplug_handler(), and when
> > > > > > > that handler returns, the lock is released.
> > > > > > > 
> > > > > > > This patch modifies crash_core.c to implement a subsys_initcall()
> > > > > > > function that installs handlers for hot plug/unplug events. If CPU
> > > > > > > hotplug is enabled, then cpuhp_setup_state() is invoked to register a
> > > > > > > handler for CPU changes. Similarly, if memory hotplug is enabled, then
> > > > > > > register_memory_notifier() is invoked to install a handler for memory
> > > > > > > changes. These handlers in turn invoke the common generic handler
> > > > > > > crash_hotplug_handler().
> > > > > > > 
> > > > > > > On the CPU side, cpuhp_setup_state_nocalls() is invoked with parameter
> > > > > > > CPUHP_AP_ONLINE_DYN. While this works, when a CPU is being unplugged,
> > > > > > > the CPU still shows up in foreach_present_cpu() during the regeneration
> > > > > > > of the new CPU list, thus the need to explicitly check and exclude the
> > > > > > > soon-to-be offlined CPU in crash_prepare_elf64_headers().
> > > > > > > 
> > > > > > > On the memory side, each un/plugged memory block passes through the
> > > > > > > handler. For example, if a 1GiB DIMM is hotplugged, that generate 8
> > > > > > > memory events, one for each 128MiB memblock.
> > > > > > 
> > > > > > I rewrite the log as below with my understanding. Hope it's simpler to
> > > > > > help people get what's going on here. Please consider to take if it's
> > > > > > OK to you or adjust based on this. The code looks good to me.
> > > > > > 
> > > > > Made some tuning:
> > > > > 
> > > > > crash: add generic infrastructure for crash hotplug support
> > > > > 
> > > > > Upon CPU and memory changes, a generic crash_hotplug_handler() is added
> > > > > to dispatch the hot plug/unplug event to the architecture specific
> > > > > arch_crash_hotplug_handler(). During the process, kexec_mutex need be
> > > > > held.
> > > > > 
> > > > > To support cpu hotplug, one callback pair are registered to capture
> > > > > KEXEC_CRASH_HP_ADD_CPU and KEXEC_CRASH_HP_REMOVE_CPU events via
> > > > > cpuhp_setup_state_nocalls().
> > > > s/KEXEC_CRASH_HP_ADD}REMOVE_CPU/CPUHP_AP_ONLINE_DYN/ as the KEXEC_CRASH are the
> > > > names I've introduced with this patch?
> > > 
> > > Right.
> > > 
> > > While checking it, I notice hp_action which you don't use actually.
> > > Can you reconsider that part of design, the hp_action, the a, b
> > > parameter passed to handler?
> > 
> > Sure I can remove. I initially put in there as this was generic
> > infrastructure and not sure if it would benefit others.
> > eric
> > 
> 
> Actually, I will keep the hp_action as the work by Sourabh Jain for PPC uses
> the hp_action. I'll drop the a and b.

Sounds great.

> 
> Also, shall I post v6, or are you still looking at patches 7 and 8?

Will check today, thanks for the effort.




More information about the kexec mailing list