[PATCH] Put each per-cpu kdump ELF notes into a single page

Vivek Goyal vgoyal at redhat.com
Thu Sep 11 14:16:37 PDT 2014


On Thu, Sep 11, 2014 at 10:43:30PM +0200, Petr Tesarik wrote:
> On Thu, 11 Sep 2014 16:01:10 -0400
> Vivek Goyal <vgoyal at redhat.com> wrote:
> 
> > On Fri, Sep 05, 2014 at 06:33:14PM +0200, Petr Tesarik wrote:
> > > On architectures that use percpu-vm, the percpu region is not guaranteed
> > > to be contiguous in physical space.
> > 
> > Petr,
> > 
> > Which are those arches?
> 
> All except nommu. Actually, percpu-km will be used instead even on MMU
> if SMP is disabled, but since SMP is pretty standard now, I guess the
> vast majority of all kernels out there is affected. ;-)

Hi Petr,

To make sure I understand it correctly I will just summarize what you
said.

alloc_percpu() code does not guarantee that an object will be on physically
contiguous pages if object crosses page boundary. That's why we are forcing
allocation of object aligned to nearest higher power of two boundary of
object size and that way object will always be on same page (as long as object
is not bigger than a page).

Is that a fair summary?

Thanks
Vivek
 
> 
> > > However, fs/proc/vmcore.c expects
> > > all ELF notes to be contiguous. If the ELF note happens to occupy
> > > two non-adjacent physical pages, part of the note may be read from an
> > > incorrect memory location by the kdump kernel, resulting in failure to
> > > initialize /proc/vmcore (if the content of the following physical page,
> > > incorrectly interpreted as an ELF note specifies a large number), wrong
> > > register values or other apparent random memory corruption.
> > > 
> > > There is currently no mechanism to pass the virtual-to-physical mapping
> > > of the percpu allocation to the kdump kernel. So, instead, I'm changing
> > > the alignment of the ELF note buffer. Since sizeof(note_buf_t) is less
> > > than PAGE_SIZE, aligning the buffer to the nearest higher power of 2
> > > is enough to make sure that the buffer cannot cross a page boundary,
> > > effectively ensuring that the whole buffer is contiguous in physical
> > > space.
> > > 
> > > Signed-off-by: Petr Tesarik <ptesarik at suse.cz>
> > > ---
> > >  kernel/kexec.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/kexec.c b/kernel/kexec.c
> > > index 2bee072..cdab59d 100644
> > > --- a/kernel/kexec.c
> > > +++ b/kernel/kexec.c
> > > @@ -1610,7 +1610,8 @@ void crash_save_cpu(struct pt_regs *regs, int cpu)
> > >  static int __init crash_notes_memory_init(void)
> > >  {
> > >  	/* Allocate memory for saving cpu registers. */
> > > -	crash_notes = alloc_percpu(note_buf_t);
> > > +	crash_notes = __alloc_percpu(sizeof(note_buf_t),
> > > +				     roundup_pow_of_two(sizeof(note_buf_t)));
> > 
> > I think some of the changelog should show up here as comment in short
> > form. I don't think it is obvious that why we are using __alloc_percpu()
> > and why aligning to nearst higher power of 2 is needed here. Please also
> > mention here which arches run into issues.
> 
> OK, I'll add it as a comment in the code. I'll see if I can make it
> short but still understandable.
> 
> Thanks,
> Petr Tesarik
> 
> > Thanks
> > Vivek
> > 
> > >  	if (!crash_notes) {
> > >  		pr_warn("Kexec: Memory allocation for saving cpu register states failed\n");
> > >  		return -ENOMEM;
> > > -- 
> > > 1.8.4.5
> > > 
> > > _______________________________________________
> > > kexec mailing list
> > > kexec at lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/kexec
> > 
> > _______________________________________________
> > kexec mailing list
> > kexec at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec



More information about the kexec mailing list