makedumpfile memory usage grows with system memory size

HATAYAMA Daisuke d.hatayama at jp.fujitsu.com
Mon May 14 01:44:28 EDT 2012


From: Atsushi Kumagai <kumagai-atsushi at mxc.nes.nec.co.jp>
Subject: Re: makedumpfile memory usage grows with system memory size
Date: Fri, 27 Apr 2012 16:46:49 +0900

>     - Now, the prototype doesn't support PG_buddy because the value of PG_buddy
>       is different depending on kernel configuration and it isn't stored into 
>       VMCOREINFO. However, I'll extend get_length_of_free_pages() for PG_buddy 
>       when the value of PG_buddy is stored into VMCOREINFO.

Hello Kumagai san,

I'm now investigating how to filter free pages without kernel
debuginfo. For this, I've looked at which of PG_buddy and _mapcount
should be used for each kernel version. My conclusion so far is that
the choices in the following table are reasonable.

| kernel version   | Use PG_buddy or _mapcount?                               |
|------------------+----------------------------------------------------------|
| 2.6.15 -- 2.6.16 | offsetof(page,_mapcount):=sizeof(ulong)+sizeof(atomic_t) |
| 2.6.17 -- 2.6.26 |        PG_buddy := 19                                    |
| 2.6.27 -- 2.6.36 |        PG_buddy := 18                                    |
| 2.6.37 and later | offsetof(page,_mapcount):= under investigation           |

In summary: PG_buddy was first introduced in 2.6.17 with value 19 to
fix a race bug leading to LRU list corruption, and from 2.6.17 to
2.6.26 it was defined as a preprocessor macro. In 2.6.27, enum
pageflags was introduced to ease maintenance of page flags, and the
value changed to 18. In 2.6.37, PG_buddy was removed, and it no longer
exists in later kernel versions.
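
As a rough illustration of the table and summary above, the version
ranges could be turned into a lookup like the following. This is only a
sketch: buddy_info_for(), kver() and the enum names are hypothetical
and are not part of makedumpfile, and the CONFIG_PAGEFLAGS_EXTENDED
case is deliberately left out.

```c
/* How free (buddy) pages would be identified for a given kernel version. */
enum buddy_method { BUDDY_UNSUPPORTED, BUDDY_BY_MAPCOUNT, BUDDY_BY_PG_BUDDY };

struct buddy_info {
        enum buddy_method method;
        int pg_buddy_bit;       /* bit number of PG_buddy, or -1 if unused */
};

/* Pack a version such as 2.6.27 into one comparable integer. */
static unsigned long kver(unsigned a, unsigned b, unsigned c)
{
        return ((unsigned long)a << 16) | (b << 8) | c;
}

/* Hypothetical helper: map a kernel version to the rows of the table. */
static struct buddy_info buddy_info_for(unsigned long version)
{
        struct buddy_info info = { BUDDY_UNSUPPORTED, -1 };

        if (version < kver(2, 6, 15))
                return info;                    /* not covered by the table */

        if (version <= kver(2, 6, 16)) {
                info.method = BUDDY_BY_MAPCOUNT;        /* no PG_buddy yet */
        } else if (version <= kver(2, 6, 26)) {
                info.method = BUDDY_BY_PG_BUDDY;        /* preprocessor macro */
                info.pg_buddy_bit = 19;
        } else if (version <= kver(2, 6, 36)) {
                info.method = BUDDY_BY_PG_BUDDY;        /* enum pageflags */
                info.pg_buddy_bit = 18;
        } else {
                info.method = BUDDY_BY_MAPCOUNT;        /* PG_buddy removed */
        }
        return info;
}
```

For example, buddy_info_for(kver(2, 6, 32)) would select PG_buddy with
bit 18, while any version from 2.6.37 on falls back to _mapcount.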

My quick impression is that, from 2.6.17 to 2.6.36, resolving the
dependency on PG_buddy is simpler than resolving that on _mapcount.

In 2.6.15 and 2.6.16, PG_buddy had not yet been introduced, so we need
to rely on _mapcount. Resolving the _mapcount dependency in general,
on all supported kernel versions, is very complex, but on these two
kernel versions the definition of struct page begins with the
following layout. I think it's not so complex to hardcode the offset
of _mapcount for these two kernel versions only: that is,
sizeof(unsigned long) + sizeof(atomic_t), where atomic_t is in fact
struct { volatile int counter; } on all platforms.

struct page {
        unsigned long flags;            /* Atomic flags, some possibly
                                         * updated asynchronously */
        atomic_t _count;                /* Usage count, see below. */
        atomic_t _mapcount;             /* Count of ptes mapped in mms,
...
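
The hardcoded offset can be sanity-checked against a mirror of just
these leading members. This is a sketch under assumptions:
sketch_atomic_t and struct page_head_sketch are hypothetical stand-ins
for the kernel definitions, and the sizes are those of the host running
the check, so a dump taken on a different architecture would need the
target's sizes instead.

```c
#include <stddef.h>

/* Stand-in for the kernel's atomic_t as described above. */
typedef struct { volatile int counter; } sketch_atomic_t;

/* Mirror of only the first three members of struct page (2.6.15 -- 2.6.16). */
struct page_head_sketch {
        unsigned long flags;
        sketch_atomic_t _count;
        sketch_atomic_t _mapcount;
};

/* Offset of _mapcount computed without any debuginfo. */
static size_t hardcoded_mapcount_offset(void)
{
        return sizeof(unsigned long) + sizeof(sketch_atomic_t);
}
```

Comparing hardcoded_mapcount_offset() with
offsetof(struct page_head_sketch, _mapcount) confirms that no padding
is inserted between flags, _count and _mapcount on the host ABI.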

During the period when PG_buddy is defined as an enumeration value,
its value depends on CONFIG_PAGEFLAGS_EXTENDED. Commit
e20b8cca760ed2a6abcfe37ef56f2306790db648 introduced PG_head and
PG_tail, which are positioned before PG_buddy if
CONFIG_PAGEFLAGS_EXTENDED is set; PG_buddy then becomes 19. However,
the only users of this option are mips, um and xtensa:

  $ git grep "CONFIG_PAGEFLAGS_EXTENDED"
  arch/mips/configs/db1300_defconfig:CONFIG_PAGEFLAGS_EXTENDED=y
  arch/um/defconfig:CONFIG_PAGEFLAGS_EXTENDED=y
  arch/xtensa/configs/iss_defconfig:CONFIG_PAGEFLAGS_EXTENDED=y
  arch/xtensa/configs/s6105_defconfig:CONFIG_PAGEFLAGS_EXTENDED=y
  include/linux/page-flags.h:#ifdef CONFIG_PAGEFLAGS_EXTENDED
  include/linux/page-flags.h:#ifdef CONFIG_PAGEFLAGS_EXTENDED
  mm/memory-failure.c:#ifdef CONFIG_PAGEFLAGS_EXTENDED
  mm/page_alloc.c:#ifdef CONFIG_PAGEFLAGS_EXTENDED

and makedumpfile doesn't support any of these platforms at present, so
we don't need to consider this case any further.
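
For reference, testing a page's flags word against PG_buddy in the
2.6.27 -- 2.6.36 range could look like the sketch below. Both function
names are hypothetical, and the two bit numbers are simply the values
given above for kernels built without and with
CONFIG_PAGEFLAGS_EXTENDED.

```c
/* Bit number of PG_buddy for 2.6.27 -- 2.6.36, per the description above. */
static int pg_buddy_bit(int pageflags_extended)
{
        return pageflags_extended ? 19 : 18;
}

/* Return 1 if the given bit is set in a page's flags word, else 0. */
static int page_is_buddy(unsigned long flags, int bit)
{
        return (int)((flags >> bit) & 1UL);
}
```

Since the supported platforms do not set CONFIG_PAGEFLAGS_EXTENDED,
only pg_buddy_bit(0) would be needed in practice.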

On 2.6.37 and later kernels, we must use _mapcount. I'm now looking
into how to get the offset of _mapcount in each kernel version without
kernel debug information. But the page structure has changed
considerably in recent kernels, so I expect that hardcoding the
offsets will get more complicated.

Anyway, I think it would be better to add _mapcount information to
VMCOREINFO upstream as soon as possible.

Thanks.
HATAYAMA, Daisuke



