makedumpfile: 4.5 kernel commit breaks page filtering

Dave Anderson anderson at redhat.com
Thu Feb 18 11:24:36 PST 2016



----- Original Message -----
> Dave, I confirmed that if I use -d17 for makedumpfile I can then capture a
> usable core.
> I am on 4.5.0-0.rc3.git3.1
> Thanks
> Laurence
> 
> Laurence Oberman
> Principal Software Maintenance Engineer
> Red Hat Global Support Services

OK, good.  And actually, you might be able to get away with filtering
cache-with-private and cache-without-private pages as well.  Looking
further at the makedumpfile code, only the user-data check requires
PAGE_MAPPING_ANON (the low bit of page->mapping) to be set.  The two
page-cache variants look for it to be 0:

                /*
                 * Exclude the non-private cache page.
                 */
                else if ((info->dump_level & DL_EXCLUDE_CACHE)
                    && (isLRU(flags) || isSwapCache(flags))
                    && !isPrivate(flags) && !isAnon(mapping)) {
                        pfn_counter = &pfn_cache;
                }
                /*
                 * Exclude the cache page whether private or non-private.
                 */
                else if ((info->dump_level & DL_EXCLUDE_CACHE_PRI)
                    && (isLRU(flags) || isSwapCache(flags))
                    && !isAnon(mapping)) {
                        if (isPrivate(flags))
                                pfn_counter = &pfn_cache_private;
                        else
                                pfn_counter = &pfn_cache;
                }
                /*
                 * Exclude the data page of the user process.
                 *  - anonymous pages
                 *  - hugetlbfs pages
                 */
                else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
                         && (isAnon(mapping) || isHugetlb(compound_dtor))) {
                        pfn_counter = &pfn_user;
                }

The isAnon() function looks like this:

  static inline int
  isAnon(unsigned long mapping) 
  {
          return ((unsigned long)mapping & PAGE_MAPPING_ANON) != 0;
  }
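
To illustrate why that test misfires on 4.5 kernels, here is a minimal
standalone sketch.  It applies the same mask to the poisoned tail-page
page->mapping value shown in the "kmem -p" output of the original report
(dead0000ffffffff) and to a NULL mapping.  PAGE_MAPPING_ANON is assumed
to be 1, as I believe the kernel headers define it, and
show_mapping_classification() is a made-up helper, not makedumpfile code:

```c
#include <assert.h>
#include <stdio.h>

/* Assumed value: the kernel defines PAGE_MAPPING_ANON as the low bit. */
#define PAGE_MAPPING_ANON 1UL

/* Same test as makedumpfile's isAnon(), applied to a raw page->mapping. */
static inline int
isAnon(unsigned long mapping)
{
        return (mapping & PAGE_MAPPING_ANON) != 0;
}

/*
 * Hypothetical helper: print how two sample page->mapping values are
 * classified.  Assumes a 64-bit unsigned long.
 */
void
show_mapping_classification(void)
{
        /* Poisoned tail-page value from the "kmem -p" output in the report. */
        unsigned long poisoned = 0xdead0000ffffffffUL;
        /* NULL mapping, which tail pages carried before the 4.5 change. */
        unsigned long null_mapping = 0UL;

        assert(sizeof(unsigned long) == 8);
        printf("mapping %016lx -> isAnon=%d\n", poisoned, isAnon(poisoned));
        printf("mapping %016lx -> isAnon=%d\n", null_mapping,
               isAnon(null_mapping));
}
```

The poisoned value has its low bit set, so isAnon() reports it as an
anonymous page even though it is only a tail page of a compound page.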

Note above that only DL_EXCLUDE_USER_DATA uses isAnon(), whereas the
other two use !isAnon().  So if my logic is correct, filtering out
page-cache pages as well -- i.e., with "-d23" -- should at worst leave
some pages *not* filtered that otherwise would have been.  And I'm not
even sure of that, given the page->flags checks that go along with it.
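
As a rough sketch of that reasoning, the dump level is a bit mask, and
neither "-d17" nor "-d23" includes the user-data bit.  The bit values
below are what I believe makedumpfile.h defines, so treat them as
assumptions; the two helper functions are made up for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Assumed to match the dump-level bits in makedumpfile.h. */
#define DL_EXCLUDE_ZERO       0x001 /* zero-filled pages */
#define DL_EXCLUDE_CACHE      0x002 /* cache pages without private */
#define DL_EXCLUDE_CACHE_PRI  0x004 /* cache pages with private */
#define DL_EXCLUDE_USER_DATA  0x008 /* user process data pages */
#define DL_EXCLUDE_FREE       0x010 /* free pages */

/* Hypothetical helper: is the isAnon()-positive user-data test enabled? */
int
uses_user_data_filter(int dump_level)
{
        return (dump_level & DL_EXCLUDE_USER_DATA) != 0;
}

/* Hypothetical helper: print which page types a dump level excludes. */
void
print_dump_level(int dump_level)
{
        /* Dump levels are assumed to range from 0 to 31. */
        assert(dump_level >= 0 && dump_level <= 31);
        printf("-d%d:%s%s%s%s%s\n", dump_level,
               (dump_level & DL_EXCLUDE_ZERO) ? " zero" : "",
               (dump_level & DL_EXCLUDE_CACHE) ? " cache" : "",
               (dump_level & DL_EXCLUDE_CACHE_PRI) ? " cache-private" : "",
               (dump_level & DL_EXCLUDE_USER_DATA) ? " user-data" : "",
               (dump_level & DL_EXCLUDE_FREE) ? " free" : "");
}
```

Calling print_dump_level() with 17, 23 and 31 shows that only 31 turns
on the user-data test, which is the one that treats a set
PAGE_MAPPING_ANON bit as grounds for exclusion.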

Dave

> 
> ----- Original Message -----
> From: "Dave Anderson" <anderson at redhat.com>
> To: ats-kumagai at wm.jp.nec.com
> Cc: kexec at lists.infradead.org, "Discussion list for crash utility usage,
> maintenance and development" <crash-utility at redhat.com>, "Joe Lawrence"
> <joe.lawrence at stratus.com>, "Laurence Oberman" <loberman at redhat.com>
> Sent: Thursday, February 18, 2016 12:05:11 PM
> Subject: makedumpfile: 4.5 kernel commit breaks page filtering
> 
> 
> 
> Hello Atsushi,
> 
> I've recently had a couple of 4.5-era vmcore issues reported to me as crash
> bugs because they generate numerous initialization-time errors of the type:
> 
>   crash: page excluded: kernel virtual address: ffff880075459000  type:
>   "fill_task_struct"
> 
> Initially I thought it was related to this crash-7.1.4 fix that you posted:
> 
>     Fix for the handling of dynamically-sized task_struct structures in
>     Linux 4.2 and later kernels, which contain these commits:
>     
>       commit 5aaeb5c01c5b6c0be7b7aadbf3ace9f3a4458c3d
>       x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and
>                       use it on x86
>       commit 0c8c0f03e3a292e031596484275c14cf39c0ab7a
>       x86/fpu, sched: Dynamically allocate 'struct fpu'
>     
>     Without the patch, when running on a filtered kdump dumpfile, it is
>     possible that error messages like this will be seen when gathering
>     the tasks running on a system: "crash: page excluded: kernel virtual
>     address: <task_struct address>  type: "fill_task_struct".
>     (ats-kumagai at wm.jp.nec.com)
> 
> But upon further investigation of a suspect vmcore, there are many other
> "page excluded" errors for several other data structure types.  Joe Lawrence
> of Stratus did some kernel-bisecting, and narrowed it down to this recent
> 4.5 commit:
> 
>   commit 1c290f642101e64f379e38ea0361d097c08e824d
>   Author: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>   Date:   Fri Jan 15 16:52:07 2016 -0800
> 
>     mm: sanitize page->mapping for tail pages
>     
>     We don't define meaning of page->mapping for tail pages.  Currently it's
>     always NULL, which can be inconsistent with head page and potentially
>     lead to problems.
>     
>     Let's poison the pointer to catch all illigal uses.
>     
>     page_rmapping(), page_mapping() and page_anon_vma() are changed to look
>     on head page.
>     
>     The only illegal use I've caught so far is __GPF_COMP pages from sound
>     subsystem, mapped with PTEs.  do_shared_fault() is changed to use
>     page_rmapping() instead of direct access to fault_page->mapping.
>     
>     Signed-off-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>     Reviewed-by: Jérôme Glisse <jglisse at redhat.com>
>     Cc: Andrea Arcangeli <aarcange at redhat.com>
>     Cc: Hugh Dickins <hughd at google.com>
>     Cc: Dave Hansen <dave.hansen at intel.com>
>     Cc: Mel Gorman <mgorman at suse.de>
>     Cc: Rik van Riel <riel at redhat.com>
>     Cc: Vlastimil Babka <vbabka at suse.cz>
>     Cc: Christoph Lameter <cl at linux.com>
>     Cc: Naoya Horiguchi <n-horiguchi at ah.jp.nec.com>
>     Cc: Steve Capper <steve.capper at linaro.org>
>     Cc: "Aneesh Kumar K.V" <aneesh.kumar at linux.vnet.ibm.com>
>     Cc: Johannes Weiner <hannes at cmpxchg.org>
>     Cc: Michal Hocko <mhocko at suse.cz>
>     Cc: Jerome Marchand <jmarchan at redhat.com>
>     Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
> 
> And related to the above, and the one that affects makedumpfile, is this one:
> 
>   commit 822cdd1152265d87fcfc974e06c3b68f762987fd
>   Author: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>   Date:   Fri Jan 15 16:52:03 2016 -0800
> 
>     page-flags: look at head page if the flag is encoded in page->mapping
>     
>     PageAnon() and PageKsm() look at lower bits of page->mapping to check if
>     the page is Anon or KSM.  page->mapping can be overloaded in tail pages.
>     
>     Let's always look at head page to avoid false-positives.
>     
>     Signed-off-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>     Cc: Andrea Arcangeli <aarcange at redhat.com>
>     Cc: Hugh Dickins <hughd at google.com>
>     Cc: Dave Hansen <dave.hansen at intel.com>
>     Cc: Mel Gorman <mgorman at suse.de>
>     Cc: Rik van Riel <riel at redhat.com>
>     Cc: Vlastimil Babka <vbabka at suse.cz>
>     Cc: Christoph Lameter <cl at linux.com>
>     Cc: Naoya Horiguchi <n-horiguchi at ah.jp.nec.com>
>     Cc: Steve Capper <steve.capper at linaro.org>
>     Cc: "Aneesh Kumar K.V" <aneesh.kumar at linux.vnet.ibm.com>
>     Cc: Johannes Weiner <hannes at cmpxchg.org>
>     Cc: Michal Hocko <mhocko at suse.cz>
>     Cc: Jerome Marchand <jmarchan at redhat.com>
>     Cc: Jérôme Glisse <jglisse at redhat.com>
>     Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
> 
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 818fa39..190f191 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -176,7 +176,7 @@ static inline int PageCompound(struct page *page)
>  #define PF_NO_TAIL(page, enforce) ({                                   \
>                 VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);     \
>                 compound_head(page);})
> -#define PF_NO_COMPOUND(page, enforce) ({                              \
> +#define PF_NO_COMPOUND(page, enforce) ({                               \
>                 VM_BUG_ON_PGFLAGS(enforce && PageCompound(page), page); \
>                 page;})
>  
> @@ -381,6 +381,7 @@ PAGEFLAG(Idle, idle, PF_ANY)
>  
>  static inline int PageAnon(struct page *page)
>  {
> +       page = compound_head(page);
>         return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
>  }
>  
> @@ -393,6 +394,7 @@ static inline int PageAnon(struct page *page)
>   */
>  static inline int PageKsm(struct page *page)
>  {
> +       page = compound_head(page);
>         return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
>                                 (PAGE_MAPPING_ANON | PAGE_MAPPING_KSM);
>  }
> 
> Note that PAGE_MAPPING_ANON is now only set in the compound_head page,
> so when makedumpfile walks through the pages, it will have to look
> at each page's head page for the bit setting.
> 
> As it is now, makedumpfile runs amok filtering pages that still have
> stuff left in page->mapping.  For example, all of the addresses in
> my "filtered.list" input file are those of legitimate kernel data
> structures that have been incorrectly filtered because PAGE_MAPPING_ANON
> (the low bit, value 1) has been left set:
> 
> crash> kmem -p < filtered.list
>       PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
> ffffea0011b29040 46ca41000 dead0000ffffffff        0  0 3ffff800000000
>       PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
> ffffea0011b29040 46ca41000 dead0000ffffffff        0  0 3ffff800000000
>       PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
> ffffea0011b29640 46ca59000 dead0000ffffffff        0  0 3ffff800000000
>       PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
> ffffea0011b29640 46ca59000 dead0000ffffffff        0  0 3ffff800000000
>       PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
> ffffea0001d51640  75459000 dead0000ffffffff        0  0 1ffff800000000
> ...
> 
> In earlier kernels, the page->mapping fields above would not have
> their PAGE_MAPPING_ANON set.
> 
> Dave
> 
> 


