[PATCH v21 2/7] crash: add generic infrastructure for crash hotplug support

Sourabh Jain sourabhjain at linux.ibm.com
Wed Apr 12 01:38:42 PDT 2023


On 06/04/23 21:40, Eric DeVolder wrote:
>
>
> On 4/6/23 06:04, Baoquan He wrote:
>> On 04/04/23 at 02:03pm, Eric DeVolder wrote:
>> ......
>>> +static void crash_handle_hotplug_event(unsigned int hp_action, 
>>> unsigned int cpu)
>>> +{
>>> +    struct kimage *image;
>>> +
>>> +    /* Obtain lock while changing crash information */
>>> +    if (!kexec_trylock()) {
>>> +        pr_info("kexec_trylock() failed, elfcorehdr may be 
>>> inaccurate\n");
>>> +        return;
>>> +    }
>>> +
>>> +    /* Check kdump is not loaded */
>>> +    if (!kexec_crash_image)
>>> +        goto out;
>>> +
>>> +    image = kexec_crash_image;
>>> +
>>> +    if (hp_action == KEXEC_CRASH_HP_ADD_CPU ||
>>> +        hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
>>> +        pr_debug("hp_action %u, cpu %u\n", hp_action, cpu);
>>> +    else
>>> +        pr_debug("hp_action %u\n", hp_action);
>>
>> Seems we passed in the cpu number just for printing here. Wondering why
>> we don't print out hot added/removed memory ranges. Is the cpu number
>> printing necessary?
>>
> Baoquan,
>
> Ah, actually until recently it was used to track the 'offlinecpu' in 
> this function, but tglx pointed out that was un-necessary. That 
> resulted in dropping the code in this function dealing with 
> offlinecpu, leaving this as its only use in this function.
>
> The printing of cpu number is not necessary, but helpful; I use it for 
> debugging.
>
> The printing of memory range is also not necessary, but in order to do 
> that, should we choose to do so, requires passing in the memory range 
> to this function. This patch series did do this early on, and by v7 I 
> dropped it at your urging 
> (https://lore.kernel.org/lkml/20220401183040.1624-1-eric.devolder@oracle.com/). 
> At the time, I provided it since I considered this generic 
> infrastructure, but I could not defend it since x86 didn't need it. 
> However, PPC now needs this, and is now carrying this as part of PPC 
> support of CRASH_HOTPLUG 
> (https://lore.kernel.org/linuxppc-dev/20230312181154.278900-6-sourabhjain@linux.ibm.com/T/#u).
>
> If you'd rather I pickup the memory range handling again, I can do 
> that. I think I'd likely change this function to be:
>
>   void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
> cpu,
>      struct memory_notify *mhp);
>
> where on a CPU op the 'cpu' parameter would be valid and 'mhp' NULL, 
> and on a memory op,
> the 'mhp' would be valid and 'cpu' parameter invalid(0).
>
> I'd likely then stuff these two parameters into struct kimage so that 
> it can be utilized by arch-specific handler, if needed.
>
> And of course, would print out the memory range for debug purposes.

I think passing memory_notify as parameter is a better approach compare 
to adding the
same into struct kimage. Because once the crash hotplug event is served 
the memory_notify
object is not useful.

- Sourabh




More information about the kexec mailing list