[PATCH v12 3/7] crash: add generic infrastructure for crash hotplug support

Wed Oct 26 00:00:02 PDT 2022

Hello Baoquan,

On 24/10/22 14:40, Baoquan He wrote:
> Hi Eric, Sourabh,
>
> On 10/07/22 at 02:14pm, Eric DeVolder wrote:
>>
>> On 10/3/22 12:51, Sourabh Jain wrote:
>>> Hello Eric,
>>>
>>> On 10/09/22 02:35, Eric DeVolder wrote:
> ......
>>>> +static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
>>>> +{
>>>> +    /* Obtain lock while changing crash information */
>>>> +    mutex_lock(&kexec_mutex);
>>>> +
>>>> +    /* Check kdump is loaded */
>>>> +    if (kexec_crash_image) {
>>>> +        struct kimage *image = kexec_crash_image;
>>>> +
>>>> +        if (hp_action == KEXEC_CRASH_HP_ADD_CPU ||
>>>> +            hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
>>>> +            pr_debug("crash hp: hp_action %u, cpu %u\n", hp_action, cpu);
>>>> +        else
>>>> +            pr_debug("crash hp: hp_action %u\n", hp_action);
>>>> +
>>>> +        /*
>>>> +         * When the struct kimage is allocated, it is wiped to zero, so
>>>> +         * the elfcorehdr_index_valid defaults to false. Find the
>>>> +         * segment containing the elfcorehdr, if not already found.
>>>> +         * This works for both the kexec_load and kexec_file_load paths.
>>>> +         */
>>>> +        if (!image->elfcorehdr_index_valid) {
>>>> +            unsigned char *ptr;
>>>> +            unsigned long mem, memsz;
>>>> +            unsigned int n;
>>>> +
>>>> +            for (n = 0; n < image->nr_segments; n++) {
>>>> +                mem = image->segment[n].mem;
>>>> +                memsz = image->segment[n].memsz;
>>>> +                ptr = arch_map_crash_pages(mem, memsz);
>>>> +                if (ptr) {
>>>> +                    /* The segment containing elfcorehdr */
>>>> +                    if (memcmp(ptr, ELFMAG, SELFMAG) == 0) {
>>>> +                        image->elfcorehdr_index = (int)n;
>>>> +                        image->elfcorehdr_index_valid = true;
>>>> +                    }
>>>> +                }
>>>> +                arch_unmap_crash_pages((void **)&ptr);
>>>> +            }
>>>> +        }
>>>> +
>>>> +        if (!image->elfcorehdr_index_valid) {
>>>> +            pr_err("crash hp: unable to locate elfcorehdr segment");
>>>> +            goto out;
>>>> +        }
>>>> +
>>>> +        /* Needed in order for the segments to be updated */
>>>> +        arch_kexec_unprotect_crashkres();
>>>> +
>>>> +        /* Flag to differentiate between normal load and hotplug */
>>>> +        image->hotplug_event = true;
>>>> +
>>>> +        /* Now invoke arch-specific update handler */
>>>> +        arch_crash_handle_hotplug_event(image, hp_action);
>>>> +
>>>> +        /* No longer handling a hotplug event */
>>>> +        image->hotplug_event = false;
>>>> +
>>>> +        /* Change back to read-only */
>>>> +        arch_kexec_protect_crashkres();
>>>> +    }
>>>> +
>>>> +out:
>>>> +    /* Release lock now that update complete */
>>>> +    mutex_unlock(&kexec_mutex);
>>>> +}
>>>> +
>>>> +static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v)
>>>> +{
>>>> +    switch (val) {
>>>> +    case MEM_ONLINE:
>>>> +        handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY, 0);
>>>> +        break;
>>>> +
>>>> +    case MEM_OFFLINE:
>>>> +        handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY, 0);
>>>> +        break;
>>>> +    }
>>>> +    return NOTIFY_OK;
>>> Can we pass the v (memory_notify) argument to the arch_crash_handle_hotplug_event
>>> function via handle_hotplug_event?
>>>
>>> Because of the way memory hotplug is handled on PowerPC, it is hard to update the
>>> elfcorehdr without the memory_notify args.
>>>
>>> On PowerPC the memblock data structure is used to prepare the elfcorehdr for kdump. Since the
>>> notifier used for the memory hotplug crash handler is invoked before the memblock data structure
>>> is updated (as depicted below), the newly prepared elfcorehdr still holds the old memory regions.
>>> If the system then crashes with this stale elfcorehdr, makedumpfile fails to collect the vmcore.
>>>
>>> Sequence of actions done on PowerPC to serve a memory hotplug request:
>>>
>>>    Initiate memory hot remove
>>>             |
>>>             v
>>>    offline pages
>>>             |
>>>             v
>>>    initiate memory notify call chain
>>>    for MEM_OFFLINE event.
>>>    (same is used for crash update)
>>>             |
>>>             v
>>>    prepare new elfcorehdr for kdump using
>>>    memblock data structure
>>>             |
>>>             v
>>>    update memblock data structure
>>>
>>> How will passing memory_notify to the arch crash hotplug handler help?
>>>
>>> memory_notify holds the start PFN and page count; from these we can derive
>>> the base address and size of the hot-unplugged memory and use them
>>> to keep the hot-unplugged memory region out of the elfcorehdr.
>>>
>>> Thanks,
>>> Sourabh Jain
>>>
>> Sourabh, let's see what Baoquan thinks.
>>
>> Baoquan, are you OK with this request? I once had these parameters to the
>> crash hotplug handler and since they were unused at the time, you asked
>> that I remove them, which I did.
> Sorry for missing this mail. I thought the two of you were discussing
> something, and didn't notice this question to me.
>
> I think there are two ways to solve the issue Sourabh raised:
> 1) make handle_hotplug_event() get and pass down the memory_notify as
> Sourabh said, or the hp_action, mem_start|size as Eric suggested. I
> have to admit I haven't carefully checked which one is better.
>
> 2) leave the current code as is, since it's aimed at x86 only. Later
> Sourabh can modify the code according to his needs on ppc. That way each
> code change comes with a convincing justification.
>
> I personally prefer the 2nd way, though I would also be fine with the 1st
> one if the code change and log are convincing to any reviewer.

Ok, let's go with the second approach. I will introduce a patch in the PowerPC
series to update the handle_hotplug_event function signature and justify
the change.
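
To make the memory_notify idea quoted above concrete, here is a minimal
user-space sketch (not kernel code, and not the eventual PowerPC patch). It
turns a start PFN and page count into a physical range and removes that range
from a list of memory regions before they would be written to the elfcorehdr.
The names mem_notify_sketch, mem_range, pfn_to_phys_sketch and exclude_range
are hypothetical, and the 4K page size is an assumption for illustration only.

```c
#include <assert.h>

/* Assumption for illustration: 4K pages. Real configs may differ. */
#define PAGE_SHIFT_SKETCH 12

/* Hypothetical stand-in for the two memory_notify fields used here. */
struct mem_notify_sketch {
    unsigned long start_pfn;
    unsigned long nr_pages;
};

/* A physical memory region: start inclusive, end exclusive. */
struct mem_range {
    unsigned long long start;
    unsigned long long end;
};

/* Convert a page frame number (or a page count) to bytes. */
unsigned long long pfn_to_phys_sketch(unsigned long pfn)
{
    return (unsigned long long)pfn << PAGE_SHIFT_SKETCH;
}

/*
 * Copy the n regions of 'in' to 'out', leaving out [rm_start, rm_end).
 * A region that straddles the removed range is split in two, so 'out'
 * must have room for n + 1 entries. Returns the number of entries
 * written to 'out'.
 */
int exclude_range(const struct mem_range *in, int n,
                  unsigned long long rm_start, unsigned long long rm_end,
                  struct mem_range *out)
{
    int i, k = 0;

    assert(rm_start <= rm_end);

    for (i = 0; i < n; i++) {
        /* No overlap: keep the region unchanged. */
        if (rm_end <= in[i].start || rm_start >= in[i].end) {
            out[k++] = in[i];
            continue;
        }
        /* Keep the part below the removed range, if any. */
        if (in[i].start < rm_start) {
            out[k].start = in[i].start;
            out[k].end = rm_start;
            k++;
        }
        /* Keep the part above the removed range, if any. */
        if (rm_end < in[i].end) {
            out[k].start = rm_end;
            out[k].end = in[i].end;
            k++;
        }
    }
    return k;
}
```

A caller would compute base = pfn_to_phys_sketch(start_pfn) and
size = pfn_to_phys_sketch(nr_pages), then pass base and base + size to
exclude_range() together with the region list taken from memblock.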

Thanks,
Sourabh Jain