[PATCH] Add +~800M crashkernel explanation

Xunlei Pang xpang at redhat.com
Wed Dec 14 15:17:10 PST 2016


On 12/15/2016 at 01:50 AM, Robert LeBlanc wrote:
> On Tue, Dec 13, 2016 at 8:08 PM, Xunlei Pang <xpang at redhat.com> wrote:
>> On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote:
>>> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He <bhe at redhat.com> wrote:
>>>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote:
>>>>> When trying to configure crashkernel greater than about 800 MB, the
>>>>> kernel fails to allocate memory on x86 and x86_64. This is due to an
>>>>> undocumented limit that the crashkernel and other low memory items must
>>>>> be allocated below 896 MB unless the ",high" option is given. This
>>>>> updates the documentation to explain this and what I understand the
>>>>> limitations to be on the option.
>>>> This is true, but not very accurate. You found the limit is about 800M
>>>> because the running kernel usually needs about 40M of space, and some
>>>> extra reservations made before reserve_crashkernel is invoked take
>>>> another ~10M. That is only the normal case, though; people may build
>>>> modules into the kernel or have special code that bloats it. This patch
>>>> makes sense for addressing the low|high issue, but it might not be good
>>>> to state ~800M so definitively.
>>> My testing showed that I could go anywhere from about 830M to 880M,
>>> depending on distro, kernel version, and stuff that you mentioned. I
>>> just thought some rule of thumb of when to consider using high would
>>> be good. People may not think that 800 MB is 'large' when you have 512
>>> GB of RAM for instance. I thought about making 512 MB be the rule of
>>> thumb, but you can do a lot with ~300 MB.
>> Hi Robert,
>>
>> I think you are correct.
>>
>> For x86, the kernel uses memblock to find a suitable range between 16MB and some "end".
>> Without the ",high" option, "end" is CRASH_ADDR_LOW_MAX; with it, "end" is CRASH_ADDR_HIGH_MAX.
>>
>> You can find the definition for both 32-bit and 64-bit:
>> #ifdef CONFIG_X86_32
>> # define CRASH_ADDR_LOW_MAX     (512 << 20)
>> # define CRASH_ADDR_HIGH_MAX    (512 << 20)
>> #else
>> # define CRASH_ADDR_LOW_MAX     (896UL << 20)
>> # define CRASH_ADDR_HIGH_MAX    MAXMEM
>> #endif
>>
>> Since some memory has already been allocated by the kernel, a reservation failure is highly likely
>> when specifying a crashkernel value near 800MB (for x86_64), which is what you ran into. We can't
>> give an exact threshold, but it would be good to have some explanation of this in the document.
> To make sure I'm understanding what you are saying, you want me to go
> into a bit more detail about the limitation and specify the
> differences between x86 and x86_64, right?

Yeah, it would be better to have one, at least to mention the different upper bounds.
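
To put numbers on those bounds, here is a small user-space sketch (not kernel code). The
_32/_64 suffixes and both helper functions are made up for illustration, the constants are
widened to unsigned long long so the file builds anywhere, and the ~40M and ~10M inputs are
only the rough estimates given earlier in this thread.

```c
/* User-space sketch only; none of this is kernel code. */

#define MB (1ULL << 20)

/* Values copied from the definitions quoted above. The _32/_64 suffixes are
 * hypothetical, added so both variants fit in one file. MAXMEM is left out
 * because its value depends on the kernel configuration. */
#define CRASH_ADDR_LOW_MAX_32   (512ULL << 20)
#define CRASH_ADDR_HIGH_MAX_32  (512ULL << 20)
#define CRASH_ADDR_LOW_MAX_64   (896ULL << 20)

/* Convert a byte count to whole megabytes. */
unsigned long long to_mb(unsigned long long bytes)
{
    return bytes / MB;
}

/* Hypothetical helper: rough room (in MB) left below `limit` once the
 * running kernel and the early reservations have taken their share.
 * Returns 0 if nothing is left. */
unsigned long long rough_headroom_mb(unsigned long long limit,
                                     unsigned long long kernel_mb,
                                     unsigned long long early_mb)
{
    unsigned long long used = (kernel_mb + early_mb) * MB;

    return used >= limit ? 0 : (limit - used) / MB;
}
```

With those estimates, rough_headroom_mb(CRASH_ADDR_LOW_MAX_64, 40, 10) gives 846, which falls
inside the 830M to 880M range Robert measured. The same call against the 512MB bound gives 462.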

As I replied in another post, if you really want to detail the behaviour, you should also
mention "crashkernel=size[KMG][@offset[KMG]]" with @offset[KMG] specified explicitly. That
form is handled differently, with no upper bound, but it may leave the first kernel short
of low memory (some devices require 32-bit DMA). It must be used with care, because the
kernel assumes users know what they are doing and makes the reservation as long as the
given range is available.
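
The three cases can be summarised in a simplified model. This is only a sketch of the
behaviour as described in this thread, not the actual reserve_crashkernel code: the struct,
both function names and SEARCH_START are hypothetical, and the caller passes in whatever
low_max/high_max bounds apply to the architecture.

```c
#include <stdbool.h>

#define MB (1ULL << 20)
#define SEARCH_START (16 * MB)   /* hypothetical name for the 16MB floor */

struct crash_window {
    unsigned long long start;  /* first byte considered */
    unsigned long long end;    /* one past the last byte considered */
    bool fixed;                /* true when the user pinned the range */
};

/* Pick the address window in which a reservation of `size` bytes is tried.
 * With an explicit offset the window is exactly the requested range and no
 * upper bound is applied; otherwise the search runs from 16MB up to the low
 * or high bound, depending on whether ",high" was given. */
struct crash_window crash_search_window(unsigned long long size,
                                        bool has_offset,
                                        unsigned long long offset,
                                        bool high,
                                        unsigned long long low_max,
                                        unsigned long long high_max)
{
    struct crash_window w;

    if (has_offset) {
        w.start = offset;
        w.end = offset + size;
        w.fixed = true;
        return w;
    }

    w.start = SEARCH_START;
    w.end = high ? high_max : low_max;
    w.fixed = false;
    return w;
}

/* Necessary condition only: the window must be at least `size` bytes wide.
 * Whether the reservation really succeeds also depends on what is already
 * allocated inside the window. */
bool window_can_hold(struct crash_window w, unsigned long long size)
{
    return w.end > w.start && size <= w.end - w.start;
}
```

In this model "crashkernel=1G" with a low bound of 896MB can never fit, since the window is
only 880MB wide before anything else is reserved in it, while "crashkernel=64M@16M" is pinned
to the 16MB to 80MB range regardless of the bounds.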

>
>>> I'm happy to adjust the wording, what would you recommend? Also, I'm
>>> not 100% sure that I got the cases covered correctly. I was surprised
>>> that I could not get it to work with the "new" format with the
>>> multiple ranges, and that specifying an offset wouldn't work either,
>>> although the offset kind of makes sense. Do you know for sure that it
>>> doesn't work with ranges?
>>>
>>> I tried,
>>>
>>> crashkernel=256M-1G:128M,high,1G-4G:256M,high,4G-:512M,high
>>>
>>> and
>>>
>>> crashkernel=256M-1G:128M,1G-4G:256M,4G-:512M,high
>>>
>>> and neither worked. It seems that a better separator would be ';'
>>> instead of ',' for ranges, then you could specify options better. Kind
>>> of hard to change now.
>> For "crashkernel=range1:size1[,range2:size2,...][@offset]"
>> I'm afraid the current implementation doesn't support the ",high" option with this format, so there is no guarantee.
>> I guess we can add a note to clear up the confusion.
> I tried to express in the extended syntax section that ',high' is not
> available and you have to use the 'simple' format. Do you think this

ditto

> needs to be expanded as well?

If you really have good reasons or use cases, please try it :-)

Regards,
Xunlei

>
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
>>>>> Signed-off-by: Robert LeBlanc <robert at leblancnet.us>
>>>>> ---
>>>>>  Documentation/kdump/kdump.txt | 22 +++++++++++++++++-----
>>>>>  1 file changed, 17 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
>>>>> index b0eb27b..aa3efa8 100644
>>>>> --- a/Documentation/kdump/kdump.txt
>>>>> +++ b/Documentation/kdump/kdump.txt
>>>>> @@ -256,7 +256,9 @@ While the "crashkernel=size[@offset]" syntax is sufficient for most
>>>>>  configurations, sometimes it's handy to have the reserved memory dependent
>>>>>  on the value of System RAM -- that's mostly for distributors that pre-setup
>>>>>  the kernel command line to avoid a unbootable system after some memory has
>>>>> -been removed from the machine.
>>>>> +been removed from the machine. If you need to allocate more than ~800M
>>>>> +for x86 or x86_64 then you must use the simple format as the format
>>>>> +',high' conflicts with the separators of ranges.
>>>>>
>>>>>  The syntax is:
>>>>>
>>>>> @@ -282,11 +284,21 @@ Boot into System Kernel
>>>>>  1) Update the boot loader (such as grub, yaboot, or lilo) configuration
>>>>>     files as necessary.
>>>>>
>>>>> -2) Boot the system kernel with the boot parameter "crashkernel=Y@X",
>>>>> +2) Boot the system kernel with the boot parameter "crashkernel=Y[@X | ,high]",
>>>>>     where Y specifies how much memory to reserve for the dump-capture kernel
>>>>> -   and X specifies the beginning of this reserved memory. For example,
>>>>> -   "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
>>>>> -   starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
>>>>> +   and X specifies the beginning of this reserved memory or ',high' to load in
>>>>> +   high memory. For example, "crashkernel=64M@16M" tells the system
>>>>> +   kernel to reserve 64 MB of memory starting at physical address
>>>>> +   0x01000000 (16MB) for the dump-capture kernel.
>>>>> +
>>>>> +   Specifying "crashkernel=1G,high" tells the system kernel to reserve 1 GB
>>>>> +   of memory using high memory for the dump-capture kernel, there may also
>>>>> +   be some low memory allocated as well. If you need more than ~800M for
>>>>> +   the crash kernel to operate (volumes on FC/iSCSI, large volumes, systemd
>>>>> +   added to the previous, etc), you need to specify ',high' since without
>>>>> +   it crashkernel has to try and fit under 896M along with some other
>>>>> +   items and will fail to allocate memory. High memory may only be relevant
>>>>> +   on x86 and x86_64.
>>>>>
>>>>>     On x86 and x86_64, use "crashkernel=64M@16M".
>>>>>
>>>>> --
>>>>> 2.10.2
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> kexec mailing list
>>>>> kexec at lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/kexec
