[PATCH] kdump: force page alignment for per-CPU crash notes.

Eugene Surovegin surovegin at google.com
Wed Feb 29 20:56:54 EST 2012


On Wed, Feb 29, 2012 at 5:51 PM, HATAYAMA Daisuke
<d.hatayama at jp.fujitsu.com> wrote:
> From: Eugene Surovegin <surovegin at google.com>
> Subject: Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
> Date: Wed, 29 Feb 2012 17:39:55 -0800
>
>> On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms at verge.net.au> wrote:
>>> On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote:
>>>> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms at verge.net.au> wrote:
>>>>
>>>> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
>>>> > > Per-CPU allocations are not guaranteed to be physically contiguous.
>>>> > > However, kdump kernel and user-space code assumes that per-CPU
>>>> > > memory, used for saving CPU registers on crash, is.
>>>> > > This can cause a corrupted /proc/vmcore in some cases - the main
>>>> > > symptom being a huge ELF note section.
>>>> > >
>>>> > > Force page alignment for note_buf_t to ensure that this assumption holds.
>>>> >
>>>> > Ouch. I'm surprised there is an allocation on crash, perhaps
>>>> > it could at least be done earlier? And am I right in thinking
>>>> > that this change increases the likelihood that the allocation
>>>> > could fail?
>>>> >
>>>>
>>>> I'm not following. This allocation is done on start-up, not on crash.
>>>> If you cannot allocate this much memory on system boot, I'm not sure what
>>>> else you can do on this system....
>>>
>>> Sorry, my eyes deceived me. You are correct and I agree.
>>>
>>> Is it the case that note_buf_t is never larger than PAGE_SIZE?
>>> If so, your patch looks good to me.
>>
>> Currently, maximum note size is hardcoded in kexec-tools to 1024
>> (MAX_NOTE_BYTES).
>> Usually it's way less. IIRC on x86_64 it's 336 bytes.
>>
>
> This is elf_prstatus and I guess it's mostly equal to registers.
>
> crash> p sizeof(struct elf_prstatus)
> $3 = 336
> crash> ptype struct elf_prstatus
> type = struct elf_prstatus {
>    struct elf_siginfo pr_info;
>    short int pr_cursig;
>    long unsigned int pr_sigpend;
>    long unsigned int pr_sighold;
>    pid_t pr_pid;
>    pid_t pr_ppid;
>    pid_t pr_pgrp;
>    pid_t pr_sid;
>    struct timeval pr_utime;
>    struct timeval pr_stime;
>    struct timeval pr_cutime;
>    struct timeval pr_cstime;
>    elf_gregset_t pr_reg; <-- this
>    int pr_fpvalid;
> }
>
> What kind of architecture has so many registers? I'm just curious.
> Or are other kinds of notes possibly written here?

I'm not sure about other archs, but we don't write anything there
except for 'elf_prstatus' and the sentinel "final" note.
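
To illustrate why page alignment is enough as long as the note buffer
is no larger than a page, here is a minimal user-space sketch (not
kernel code). The 1024-byte figure is MAX_NOTE_BYTES as quoted above;
the 4 KiB page size and the names SKETCH_PAGE_SIZE, SKETCH_NOTE_BYTES
and crosses_page() are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the real PAGE_SIZE is architecture-specific. */
#define SKETCH_PAGE_SIZE  4096u
/* MAX_NOTE_BYTES as quoted in this thread (kexec-tools). */
#define SKETCH_NOTE_BYTES 1024u

/*
 * Hypothetical helper: returns 1 if the byte range [addr, addr + len)
 * touches more than one page, 0 otherwise. Assumes len > 0.
 */
static inline int crosses_page(uintptr_t addr, size_t len)
{
    uintptr_t first_page = addr / SKETCH_PAGE_SIZE;
    uintptr_t last_page  = (addr + len - 1) / SKETCH_PAGE_SIZE;

    return first_page != last_page;
}

/* Small self-check covering the two cases discussed in the thread. */
static inline void sketch_self_check(void)
{
    /* Buffer starting 512 bytes before a page boundary: spans two pages,
     * which need not be physically contiguous for per-CPU memory. */
    assert(crosses_page(0x10000 - 512, SKETCH_NOTE_BYTES) == 1);

    /* Page-aligned buffer no larger than one page: stays within one page. */
    assert(crosses_page(0x10000, SKETCH_NOTE_BYTES) == 0);
}
```

With a page-aligned start and a size of at most one page, the whole
buffer sits inside a single page, so physical contiguity follows
trivially. That is the property the patch relies on.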

--
Eugene


