makedumpfile: 4.5 kernel commit breaks page filtering

Atsushi Kumagai ats-kumagai at wm.jp.nec.com
Sun Feb 21 20:18:08 PST 2016


Hello Dave,

>Hello Atsushi,
>
>I've recently had a couple of 4.5-era vmcore issues reported to me as crash bugs
>because they generate numerous initialization-time errors of the type:
>
>  crash: page excluded: kernel virtual address: ffff880075459000  type: "fill_task_struct"
>
>Initially I thought it was related to this crash-7.1.4 fix that you posted:
>
>    Fix for the handling of dynamically-sized task_struct structures in
>    Linux 4.2 and later kernels, which contain these commits:
>
>      commit 5aaeb5c01c5b6c0be7b7aadbf3ace9f3a4458c3d
>      x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and
>                      use it on x86
>      commit 0c8c0f03e3a292e031596484275c14cf39c0ab7a
>      x86/fpu, sched: Dynamically allocate 'struct fpu'
>
>    Without the patch, when running on a filtered kdump dumpfile, it is
>    possible that error messages like this will be seen when gathering
>    the tasks running on a system: "crash: page excluded: kernel virtual
>    address: <task_struct address>  type: "fill_task_struct".
>    (ats-kumagai at wm.jp.nec.com)
>
>But upon further investigation of a suspect vmcore, there are many other
>"page excluded" errors for several other data structure types.  Joe Lawrence
>of Stratus did some kernel-bisecting, and narrowed it down to this recent
>4.5 commit:
>
>  commit 1c290f642101e64f379e38ea0361d097c08e824d
>  Author: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>  Date:   Fri Jan 15 16:52:07 2016 -0800
>
>    mm: sanitize page->mapping for tail pages
>
>    We don't define meaning of page->mapping for tail pages.  Currently it's
>    always NULL, which can be inconsistent with head page and potentially
>    lead to problems.
>
>    Let's poison the pointer to catch all illegal uses.
>
>    page_rmapping(), page_mapping() and page_anon_vma() are changed to look
>    on head page.
>
>    The only illegal use I've caught so far is __GFP_COMP pages from sound
>    subsystem, mapped with PTEs.  do_shared_fault() is changed to use
>    page_rmapping() instead of direct access to fault_page->mapping.
>
>    Signed-off-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>    Reviewed-by: Jérôme Glisse <jglisse at redhat.com>
>    Cc: Andrea Arcangeli <aarcange at redhat.com>
>    Cc: Hugh Dickins <hughd at google.com>
>    Cc: Dave Hansen <dave.hansen at intel.com>
>    Cc: Mel Gorman <mgorman at suse.de>
>    Cc: Rik van Riel <riel at redhat.com>
>    Cc: Vlastimil Babka <vbabka at suse.cz>
>    Cc: Christoph Lameter <cl at linux.com>
>    Cc: Naoya Horiguchi <n-horiguchi at ah.jp.nec.com>
>    Cc: Steve Capper <steve.capper at linaro.org>
>    Cc: "Aneesh Kumar K.V" <aneesh.kumar at linux.vnet.ibm.com>
>    Cc: Johannes Weiner <hannes at cmpxchg.org>
>    Cc: Michal Hocko <mhocko at suse.cz>
>    Cc: Jerome Marchand <jmarchan at redhat.com>
>    Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
>    Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
>
>And related to the above, and the one that affects makedumpfile, is this one:
>
>  commit 822cdd1152265d87fcfc974e06c3b68f762987fd
>  Author: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>  Date:   Fri Jan 15 16:52:03 2016 -0800
>
>    page-flags: look at head page if the flag is encoded in page->mapping
>
>    PageAnon() and PageKsm() look at lower bits of page->mapping to check if
>    the page is Anon or KSM.  page->mapping can be overloaded in tail pages.
>
>    Let's always look at head page to avoid false-positives.
>
>    Signed-off-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>    Cc: Andrea Arcangeli <aarcange at redhat.com>
>    Cc: Hugh Dickins <hughd at google.com>
>    Cc: Dave Hansen <dave.hansen at intel.com>
>    Cc: Mel Gorman <mgorman at suse.de>
>    Cc: Rik van Riel <riel at redhat.com>
>    Cc: Vlastimil Babka <vbabka at suse.cz>
>    Cc: Christoph Lameter <cl at linux.com>
>    Cc: Naoya Horiguchi <n-horiguchi at ah.jp.nec.com>
>    Cc: Steve Capper <steve.capper at linaro.org>
>    Cc: "Aneesh Kumar K.V" <aneesh.kumar at linux.vnet.ibm.com>
>    Cc: Johannes Weiner <hannes at cmpxchg.org>
>    Cc: Michal Hocko <mhocko at suse.cz>
>    Cc: Jerome Marchand <jmarchan at redhat.com>
>    Cc: Jérôme Glisse <jglisse at redhat.com>
>    Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
>    Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
>
>diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
>index 818fa39..190f191 100644
>--- a/include/linux/page-flags.h
>+++ b/include/linux/page-flags.h
>@@ -176,7 +176,7 @@ static inline int PageCompound(struct page *page)
> #define PF_NO_TAIL(page, enforce) ({                                   \
>                VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);     \
>                compound_head(page);})
>-#define PF_NO_COMPOUND(page, enforce) ({                                       \
>+#define PF_NO_COMPOUND(page, enforce) ({                               \
>                VM_BUG_ON_PGFLAGS(enforce && PageCompound(page), page); \
>                page;})
>
>@@ -381,6 +381,7 @@ PAGEFLAG(Idle, idle, PF_ANY)
>
> static inline int PageAnon(struct page *page)
> {
>+       page = compound_head(page);
>        return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
> }
>
>@@ -393,6 +394,7 @@ static inline int PageAnon(struct page *page)
>  */
> static inline int PageKsm(struct page *page)
> {
>+       page = compound_head(page);
>        return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
>                                (PAGE_MAPPING_ANON | PAGE_MAPPING_KSM);
> }
>
>Note that PAGE_MAPPING_ANON is now only set in the compound_head page,
>so when makedumpfile walks through the pages, it will have to look
>at each page's head page for the bit setting.

Thanks for your report.
As you said, it seems necessary to check the head page, as the kernel does.
I'll try to work it out; please give me some time.


Thanks,
Atsushi Kumagai

>As it is now, makedumpfile runs amok filtering pages that still have
>stuff left in page->mapping.  For example, all of the addresses in
>my "filtered.list" input file are those of legitimate kernel data
>structures that have been incorrectly filtered because PAGE_MAPPING_ANON
>(bit 0, value 0x1) has been left set:
>
>crash> kmem -p < filtered.list
>      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>ffffea0011b29040 46ca41000 dead0000ffffffff        0  0 3ffff800000000
>      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>ffffea0011b29040 46ca41000 dead0000ffffffff        0  0 3ffff800000000
>      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>ffffea0011b29640 46ca59000 dead0000ffffffff        0  0 3ffff800000000
>      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>ffffea0011b29640 46ca59000 dead0000ffffffff        0  0 3ffff800000000
>      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>ffffea0001d51640  75459000 dead0000ffffffff        0  0 1ffff800000000
>...
>
>In earlier kernels, the page->mapping fields above would not have
>their PAGE_MAPPING_ANON set.
>
>Dave

