[PATCH v12 7/7] x86/crash: Add x86 crash hotplug support

Eric DeVolder eric.devolder at oracle.com
Fri Oct 7 13:00:52 PDT 2022



On 10/4/22 04:10, Sourabh Jain wrote:
> 
> On 30/09/22 21:06, Eric DeVolder wrote:
>>
>>
>> On 9/28/22 11:07, Borislav Petkov wrote:
>>> On Tue, Sep 13, 2022 at 02:12:31PM -0500, Eric DeVolder wrote:
>>>> This topic was discussed previously https://lkml.org/lkml/2022/3/3/372.
>>>
>>> Please do not use lkml.org to refer to lkml messages. We have a
>>> perfectly fine archival system at lore.kernel.org. You simply do
>>>
>>> https://lore.kernel.org/r/<Message-ID>
>>>
>>> when you want to point to a previous mail.
>>
>> ok, thanks for pointing that out to me.
>>>
>>>> David points out that terminology is tricky here due to differing behaviors.
>>>> And perhaps that is your point in asking for guidance text. It can be
>>>> complicated
>>>
>>> Which means you need an explanation how to use this even more.
>>>
>>> And why is CONFIG_CRASH_MAX_MEMORY_RANGES even a Kconfig item and not
>>> something you discover from the hardware?
>>
>> No, is the short answer.
>>
>>>
>>> Your help text talks about System RAM entries in /proc/iomem which means
>>> that those entries are present somewhere in the kernel and you can read
>>> them out and do the proper calculations dynamically instead of doing the
>>> static CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES thing.
>>
>> The intent is to compute the max size buffer needed to contain a maximum populated elfcorehdr, 
>> which is primarily based on the number of CPUs and memory regions. Thus far I (and others 
>> involved) have not found a kernel method to determine the maximum number of memory regions 
>> possible (if you are aware of one, please let me know!). Thus CONFIG_CRASH_MAX_MEMORY_RANGES was 
>> born (rather borrowed from kexec-tools).
>>
>> So no dynamic computation is possible, yet.
>>
>>>
>>>> , but it all comes down to System RAM entries.
>>>>
>>>> I could perhaps offer an overly simplified example such that for 1GiB block
>>>> size, for example, the CRASH_MAX_MEMORY_RANGES of 32768 would allow for 32TiB
>>>> of memory?
>>>
>>> Yes, and stick it somewhere in Documentation/admin-guide/kdump/ and
>>> refer to it in that help text so that people can find it and read how to
>>> use your new option.
>>>
>> ok
>>
>>>> The kbuf.bufsz value is obtained via a call to prepare_elf_headers(); I can
>>>> not initialize it at its declaration.
>>>
>>> Sorry, I meant this:
>>>
>>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>>> index 8fc7d678ac72..ee6fd9f1b2b9 100644
>>> --- a/arch/x86/kernel/crash.c
>>> +++ b/arch/x86/kernel/crash.c
>>> @@ -395,8 +395,9 @@ int crash_load_segments(struct kimage *image)
>>>       if (ret)
>>>           return ret;
>>>   -    image->elf_headers = kbuf.buffer;
>>> -    image->elf_headers_sz = kbuf.bufsz;
>>> +    image->elf_headers    = kbuf.buffer;
>>> +    image->elf_headers_sz    = kbuf.bufsz;
>>> +    kbuf.memsz        = kbuf.bufsz;
>>>     #if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
>>>       /* Ensure elfcorehdr segment large enough for hotplug changes */
>>> @@ -407,9 +408,8 @@ int crash_load_segments(struct kimage *image)
>>>       image->elf_headers_sz = kbuf.memsz;
>>>       image->elfcorehdr_index = image->nr_segments;
>>>       image->elfcorehdr_index_valid = true;
>>> -#else
>>> -    kbuf.memsz = kbuf.bufsz;
>>>   #endif
>>> +
>>>       kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
>>>       kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>>>       ret = kexec_add_buffer(&kbuf);
>>>
>> ok
>>
>>>> I'm at a loss as to what to do differently here. You've raised this issue
>>>> before and I went back and looked at the suggestions then and I don't see
>>>> how that applies to this situation. How is this situation different than the
>>>> #ifdef CONFIG_KEXEC_FILE that immediately preceeds it?
>>>
>>> See the diff at the end. I'm not saying this is how you should do it
>>> but it should give you a better idea. The logic being, the functions
>>> in the .c file don't really need ifdeffery around them - you're adding
>>> 1-2 functions and crash.c is not that big - so they can be built in
>>> unconditionally. You'd need the ifdeffery *in the header only* when
>>> crash.c is not being built.
>> ok; I've overlooked that scenario.
>>>
>>> But I've done it with ifdeffery in the .c file now because yes, the
>>> kexec code is a minefield of ifdeffery. Hell, there's ifdeffery even in
>>> the headers for structs. Ifdeffery you don't really need. Someone should
>>> clean that up and simplify this immensely.
>>
>> ok
>>
>>>
>>>> Currently there is a concurrent effort for PPC support by Sourabh
>>>> Jain, and in that effort arch_map_crash_pages() is using __va(paddr).
>>>
>>> Why?
>>>
>>>> I do not know the nuances between kmap_local_page() and __va() to
>>>> answer the question.
>>>
>>> kmap_local_page() is a generic interface and it should work on any arch.
>>>
>>> And it is documented even:
>>>
>>> $ git grep kmap_local_page Documentation/
>>>
>>>> If kmap_local_page() works for all archs, then I'm happy to drop these
>>>> arch_ variants and use it directly.
>>>
>>> Yes, pls do.
>>
>> I'll check with Sourabh to see if PPC can work with kmap_local_page().
> I think kmap_local_page do support on  PowerPC. But can you explain why we need this
> function here, aren't the reserve memory already available to use?

On x86, attempts to access the elfcorehdr without mapping it did not work (resulted
in a fault).

Let me know if using kmap_local_page() in place of __va() in arch_map_crash_pages().
If it does, then I can eliminate arch_un/map_crash_pages() and use kmap_local_page()
directly.

Thanks,
eric
> 
> Thanks,
> Sourabh Jain



More information about the kexec mailing list