[PATCH v12 7/7] x86/crash: Add x86 crash hotplug support

Sourabh Jain sourabhjain at linux.ibm.com
Tue Oct 11 21:55:38 PDT 2022


On 08/10/22 01:30, Eric DeVolder wrote:
>
>
> On 10/4/22 04:10, Sourabh Jain wrote:
>>
>> On 30/09/22 21:06, Eric DeVolder wrote:
>>>
>>>
>>> On 9/28/22 11:07, Borislav Petkov wrote:
>>>> On Tue, Sep 13, 2022 at 02:12:31PM -0500, Eric DeVolder wrote:
>>>>> This topic was discussed previously 
>>>>> https://lkml.org/lkml/2022/3/3/372.
>>>>
>>>> Please do not use lkml.org to refer to lkml messages. We have a
>>>> perfectly fine archival system at lore.kernel.org. You simply do
>>>>
>>>> https://lore.kernel.org/r/<Message-ID>
>>>>
>>>> when you want to point to a previous mail.
>>>
>>> ok, thanks for pointing that out to me.
>>>>
>>>>> David points out that terminology is tricky here due to differing 
>>>>> behaviors.
>>>>> And perhaps that is your point in asking for guidance text. It can be
>>>>> complicated
>>>>
>>>> Which means you need an explanation how to use this even more.
>>>>
>>>> And why is CONFIG_CRASH_MAX_MEMORY_RANGES even a Kconfig item and not
>>>> something you discover from the hardware?
>>>
>>> No, is the short answer.
>>>
>>>>
>>>> Your help text talks about System RAM entries in /proc/iomem which 
>>>> means
>>>> that those entries are present somewhere in the kernel and you can 
>>>> read
>>>> them out and do the proper calculations dynamically instead of 
>>>> doing the
>>>> static CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES thing.
>>>
>>> The intent is to compute the max size buffer needed to contain a 
>>> maximum populated elfcorehdr, which is primarily based on the number 
>>> of CPUs and memory regions. Thus far I (and others involved) have 
>>> not found a kernel method to determine the maximum number of memory 
>>> regions possible (if you are aware of one, please let me know!). 
>>> Thus CONFIG_CRASH_MAX_MEMORY_RANGES was born (rather borrowed from 
>>> kexec-tools).
>>>
>>> So no dynamic computation is possible, yet.
>>>
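
To make the sizing argument above concrete, here is a minimal user-space sketch of the worst-case elfcorehdr size: one ELF64 file header plus one program header per CPU, per memory range, and per extra fixed entry. The function name, the parameter names, and the notion of "extra" headers are assumptions for illustration only, not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed sizes from the ELF64 format: file header and program header. */
#define ELF64_EHDR_SIZE 64u
#define ELF64_PHDR_SIZE 56u

/*
 * Hypothetical helper: upper bound on the elfcorehdr buffer size.
 *   nr_cpus       - maximum number of CPUs (one PT_NOTE entry each)
 *   nr_mem_ranges - maximum number of memory ranges (one PT_LOAD each)
 *   nr_extra      - any additional fixed program headers (assumed)
 */
static size_t elfcorehdr_max_size(size_t nr_cpus, size_t nr_mem_ranges,
                                  size_t nr_extra)
{
    size_t nr_phdrs = nr_cpus + nr_mem_ranges + nr_extra;

    /* Guard against overflow in the multiplication below. */
    assert(nr_phdrs <= ((size_t)-1 - ELF64_EHDR_SIZE) / ELF64_PHDR_SIZE);

    return ELF64_EHDR_SIZE + nr_phdrs * ELF64_PHDR_SIZE;
}
```

The point of the sketch is only that both inputs must be known upper bounds at load time; with no way to query the maximum number of memory ranges, a configured constant has to stand in for nr_mem_ranges.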
>>>>
>>>>> , but it all comes down to System RAM entries.
>>>>>
>>>>> I could perhaps offer an overly simplified example such that for 
>>>>> 1GiB block
>>>>> size, for example, the CRASH_MAX_MEMORY_RANGES of 32768 would 
>>>>> allow for 32TiB
>>>>> of memory?
>>>>
>>>> Yes, and stick it somewhere in Documentation/admin-guide/kdump/ and
>>>> refer to it in that help text so that people can find it and read 
>>>> how to
>>>> use your new option.
>>>>
>>> ok
>>>
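
The simplified example above (1GiB blocks, 32768 ranges, 32TiB) is plain multiplication. A minimal sketch, assuming the worst case where every memory block ends up as its own System RAM entry; the helper name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define GIB (1ULL << 30)
#define TIB (1ULL << 40)

/*
 * Hypothetical helper: the amount of memory that max_ranges entries can
 * describe if each entry covers exactly one block of block_bytes bytes.
 */
static uint64_t max_covered_bytes(uint64_t max_ranges, uint64_t block_bytes)
{
    /* Refuse inputs whose product would overflow 64 bits. */
    assert(block_bytes == 0 || max_ranges <= UINT64_MAX / block_bytes);

    return max_ranges * block_bytes;
}
```

With a smaller block size the same range count covers proportionally less memory, which is why the guidance text needs to mention the block size.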
>>>>> The kbuf.bufsz value is obtained via a call to 
>>>>> prepare_elf_headers(); I can
>>>>> not initialize it at its declaration.
>>>>
>>>> Sorry, I meant this:
>>>>
>>>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>>>> index 8fc7d678ac72..ee6fd9f1b2b9 100644
>>>> --- a/arch/x86/kernel/crash.c
>>>> +++ b/arch/x86/kernel/crash.c
>>>> @@ -395,8 +395,9 @@ int crash_load_segments(struct kimage *image)
>>>>       if (ret)
>>>>           return ret;
>>>>
>>>> -    image->elf_headers = kbuf.buffer;
>>>> -    image->elf_headers_sz = kbuf.bufsz;
>>>> +    image->elf_headers    = kbuf.buffer;
>>>> +    image->elf_headers_sz    = kbuf.bufsz;
>>>> +    kbuf.memsz        = kbuf.bufsz;
>>>>
>>>>  #if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
>>>>       /* Ensure elfcorehdr segment large enough for hotplug changes */
>>>> @@ -407,9 +408,8 @@ int crash_load_segments(struct kimage *image)
>>>>       image->elf_headers_sz = kbuf.memsz;
>>>>       image->elfcorehdr_index = image->nr_segments;
>>>>       image->elfcorehdr_index_valid = true;
>>>> -#else
>>>> -    kbuf.memsz = kbuf.bufsz;
>>>>   #endif
>>>> +
>>>>       kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
>>>>       kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>>>>       ret = kexec_add_buffer(&kbuf);
>>>>
>>> ok
>>>
>>>>> I'm at a loss as to what to do differently here. You've raised 
>>>>> this issue
>>>>> before and I went back and looked at the suggestions then and I 
>>>>> don't see
>>>>> how that applies to this situation. How is this situation 
>>>>> different than the
>>>>> #ifdef CONFIG_KEXEC_FILE that immediately precedes it?
>>>>
>>>> See the diff at the end. I'm not saying this is how you should do it
>>>> but it should give you a better idea. The logic being, the functions
>>>> in the .c file don't really need ifdeffery around them - you're adding
>>>> 1-2 functions and crash.c is not that big - so they can be built in
>>>> unconditionally. You'd need the ifdeffery *in the header only* when
>>>> crash.c is not being built.
>>> ok; I've overlooked that scenario.
>>>>
>>>> But I've done it with ifdeffery in the .c file now because yes, the
>>>> kexec code is a minefield of ifdeffery. Hell, there's ifdeffery 
>>>> even in
>>>> the headers for structs. Ifdeffery you don't really need. Someone 
>>>> should
>>>> clean that up and simplify this immensely.
>>>
>>> ok
>>>
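
The "ifdeffery in the header only" idea discussed above can be sketched as follows. This is a single-file illustration, not the diff referred to in the thread; the config symbol and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical config switch; left undefined here so the stub is used. */
/* #define CONFIG_DEMO_CRASH_HOTPLUG 1 */

/* ---- This part would live in the header ---- */
#ifdef CONFIG_DEMO_CRASH_HOTPLUG
/* Real implementation is built unconditionally in the .c file. */
int demo_crash_hotplug_update(int nr_cpus);
#else
/* The .c file is not built: provide a no-op stub so callers still link. */
static inline int demo_crash_hotplug_update(int nr_cpus)
{
    (void)nr_cpus;
    return 0;
}
#endif

/* ---- A caller: no #ifdef needed at the call site ---- */
static int demo_on_cpu_online(int nr_cpus)
{
    assert(nr_cpus > 0);
    return demo_crash_hotplug_update(nr_cpus);
}
```

The design choice is that the only conditional lives next to the declaration, so neither the implementation file nor the call sites need to know whether the feature is configured.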
>>>>
>>>>> Currently there is a concurrent effort for PPC support by Sourabh
>>>>> Jain, and in that effort arch_map_crash_pages() is using __va(paddr).
>>>>
>>>> Why?
>>>>
>>>>> I do not know the nuances between kmap_local_page() and __va() to
>>>>> answer the question.
>>>>
>>>> kmap_local_page() is a generic interface and it should work on any 
>>>> arch.
>>>>
>>>> And it is documented even:
>>>>
>>>> $ git grep kmap_local_page Documentation/
>>>>
>>>>> If kmap_local_page() works for all archs, then I'm happy to drop 
>>>>> these
>>>>> arch_ variants and use it directly.
>>>>
>>>> Yes, pls do.
>>>
>>> I'll check with Sourabh to see if PPC can work with kmap_local_page().
>> I think kmap_local_page() is supported on PowerPC. But can you explain
>> why we need this function here? Isn't the reserved memory already
>> available to use?
>
> On x86, attempts to access the elfcorehdr without mapping it did not 
> work (resulted
> in a fault).
>
> Let me know if using kmap_local_page() in place of __va() in
> arch_map_crash_pages() works.
> If it does, then I can eliminate arch_un/map_crash_pages() and use
> kmap_local_page() directly.
Hello Eric,

At least on ppc64 we have a direct mapping available, so we can get a
valid virtual address on powerpc just by applying the fixed linear-map
offset to the physical address (__va). In short, we don't have to create
a mapping again to access the reserved region.
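
As a rough user-space model of why __va() is enough when a direct mapping exists: the translation is constant-offset arithmetic, with no page-table setup per access. The base address and the mapped size below are illustrative assumptions, not the real powerpc values, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical base of the kernel linear map; illustrative value only. */
#define MODEL_PAGE_OFFSET     0xc000000000000000ULL
/* Hypothetical amount of RAM covered by the linear map in this model. */
#define MODEL_LINEAR_MAP_SIZE (1ULL << 40)

/* Model of __va(): physical address to linear-map virtual address. */
static uint64_t model_va(uint64_t paddr)
{
    /* Only addresses inside the linear map can be translated this way. */
    assert(paddr < MODEL_LINEAR_MAP_SIZE);
    return paddr + MODEL_PAGE_OFFSET;
}

/* Model of __pa(): the inverse translation. */
static uint64_t model_pa(uint64_t vaddr)
{
    assert(vaddr >= MODEL_PAGE_OFFSET);
    return vaddr - MODEL_PAGE_OFFSET;
}
```

This only holds where the target memory is covered by the linear map, which is the difference from the x86 case described above, where accessing the elfcorehdr without mapping it faulted.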

Regardless, let's go with the kmap_local_page() API; it is supported on powerpc.

Thanks,
Sourabh Jain
