[PATCH v21 2/7] crash: add generic infrastructure for crash hotplug support

Baoquan He bhe at redhat.com
Thu Apr 6 16:58:31 PDT 2023


On 04/06/23 at 11:10am, Eric DeVolder wrote:
> 
> 
> On 4/6/23 06:04, Baoquan He wrote:
> > On 04/04/23 at 02:03pm, Eric DeVolder wrote:
> > ......
> > > +static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
> > > +{
> > > +	struct kimage *image;
> > > +
> > > +	/* Obtain lock while changing crash information */
> > > +	if (!kexec_trylock()) {
> > > +		pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
> > > +		return;
> > > +	}
> > > +
> > > +	/* Check kdump is not loaded */
> > > +	if (!kexec_crash_image)
> > > +		goto out;
> > > +
> > > +	image = kexec_crash_image;
> > > +
> > > +	if (hp_action == KEXEC_CRASH_HP_ADD_CPU ||
> > > +		hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
> > > +		pr_debug("hp_action %u, cpu %u\n", hp_action, cpu);
> > > +	else
> > > +		pr_debug("hp_action %u\n", hp_action);
> > 
> > Seems we passed in the cpu number just for printing here. Wondering why
> > we don't print out hot added/removed memory ranges. Is the cpu number
> > printing necessary?
> > 
> Baoquan,
> 
> Ah, actually until recently it was used to track the 'offlinecpu' in this
> function, but tglx pointed out that was un-necessary. That resulted in
> dropping the code in this function dealing with offlinecpu, leaving this as
> its only use in this function.
> 
> The printing of cpu number is not necessary, but helpful; I use it for debugging.

OK, I see. I am not requesting memory range printing, just try to prove
cpu number printing is not so justified. If it's helpful, I am OK with
it. Let's see if other people have concern about this.

> 
> The printing of memory range is also not necessary, but in order to do that,
> should we choose to do so, requires passing in the memory range to this
> function. This patch series did do this early on, and by v7 I dropped it at
> your urging (https://lore.kernel.org/lkml/20220401183040.1624-1-eric.devolder@oracle.com/).
> At the time, I provided it since I considered this generic infrastructure,
> but I could not defend it since x86 didn't need it. However, PPC now needs
> this, and is now carrying this as part of PPC support of CRASH_HOTPLUG (https://lore.kernel.org/linuxppc-dev/20230312181154.278900-6-sourabhjain@linux.ibm.com/T/#u).
> 
> If you'd rather I pickup the memory range handling again, I can do that. I
> think I'd likely change this function to be:
> 
>   void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu,
>      struct memory_notify *mhp);
> 
> where on a CPU op the 'cpu' parameter would be valid and 'mhp' NULL, and on a memory op,
> the 'mhp' would be valid and 'cpu' parameter invalid(0).
> 
> I'd likely then stuff these two parameters into struct kimage so that it can
> be utilized by arch-specific handler, if needed.
> 
> And of course, would print out the memory range for debug purposes.
> 
> Let me know what you think.




More information about the kexec mailing list