Widespread boot failures on ARM due to "mm/page_alloc.c: calculate zone_start_pfn at zone_spanned_pages_in_node()"

Andrew Morton akpm at linux-foundation.org
Mon Jan 4 15:09:46 PST 2016


On Mon, 4 Jan 2016 22:42:33 +0000 Mark Brown <broonie at kernel.org> wrote:

> Since 20151231 -next has been failing to boot on a wide range of ARM
> platforms in the kernelci.org boot tests[1].  Doing bisections with
> Arndale and BeagleBone Black identifies 904769ac82ebf (mm/page_alloc.c:
> calculate zone_start_pfn at zone_spanned_pages_in_node()) from the akpm
> tree as the first broken commit[2,3].  An example bootlog from the
> failure is:
> 
>    http://storage.kernelci.org/next/next-20151231/arm-multi_v7_defconfig/lab-cambridge/boot-exynos5250-arndale.html
> 
> which shows no output on the console once we start the kernel; a brief
> sampling of failing boards suggests this is the normal failure mode.
> x86 and arm64 targets seem fine (juno shows up as failing but the boot
> log seems fine so it's probably a false positive, Mustang was failing
> already) and there are a small number of ARM platforms that boot.  I've
> not yet had any time to investigate further than that (including trying
> a revert of that commit), sorry.
> 
> [1] http://kernelci.org/boot/all/job/next/kernel/next-20151231/
> [2] https://ci.linaro.org/view/people/job/tbaker-boot-bisect-bot/135/console
> [3] https://ci.linaro.org/view/people/job/tbaker-boot-bisect-bot/136/console

Thanks.  That patch has rather a blooper if
CONFIG_HAVE_MEMBLOCK_NODE_MAP=n.  Is that the case in your testing?

Arnd's tentative fix is below.

I shall drop that patchset for now.

From: Arnd Bergmann <arnd at arndb.de>
Subject: mm/page_alloc.c: set a zone_start_pfn value in zone_spanned_pages_in_node

We got a new build warning in linux-next:

mm/page_alloc.c: In function 'free_area_init_node':
mm/page_alloc.c:5278:25: warning: 'zone_start_pfn' may be used uninitialized in this function [-Wmaybe-uninitialized]
    zone->zone_start_pfn = zone_start_pfn;
mm/page_alloc.c:5265:17: note: 'zone_start_pfn' was declared here
   unsigned long zone_start_pfn, zone_end_pfn;

The code indeed looks wrong, but this is just a guess at what the
fix might be: I have not looked at it in detail, so please treat this
as a bug report.

Signed-off-by: Arnd Bergmann <arnd at arndb.de>
Cc: Taku Izumi <izumi.taku at jp.fujitsu.com>
Cc: Tony Luck <tony.luck at intel.com>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
---

 mm/page_alloc.c |    2 ++
 1 file changed, 2 insertions(+)

diff -puN mm/page_alloc.c~mm-calculate-zone_start_pfn-at-zone_spanned_pages_in_node-fix mm/page_alloc.c
--- a/mm/page_alloc.c~mm-calculate-zone_start_pfn-at-zone_spanned_pages_in_node-fix
+++ a/mm/page_alloc.c
@@ -5013,6 +5013,8 @@ static inline unsigned long __meminit zo
 					unsigned long *zone_end_pfn,
 					unsigned long *zones_size)
 {
+	*zone_start_pfn = node_start_pfn;
+	*zone_end_pfn = node_end_pfn;
 	return zones_size[zone_type];
 }
 
_
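
For readers following along, here is a minimal userspace sketch of the
pattern the warning points at. It is not the kernel code: the names
spanned_pages_sketch(), init_zone_sketch() and struct zone_sketch are
hypothetical, and the parameter list is a simplified assumption rather
than the real signature. It only shows why the caller reads garbage
unless the CONFIG_HAVE_MEMBLOCK_NODE_MAP=n variant stores something
through the two out-parameters, which is what the tentative fix does.

```c
#include <stdio.h>

/* Hypothetical stand-in for struct zone; only the fields needed here. */
struct zone_sketch {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

/*
 * Sketch of the CONFIG_HAVE_MEMBLOCK_NODE_MAP=n variant (simplified
 * signature, an assumption).  Without the two stores below the
 * out-parameters are never written and the caller's locals stay
 * uninitialized.
 */
static unsigned long spanned_pages_sketch(int zone_type,
					  unsigned long node_start_pfn,
					  unsigned long node_end_pfn,
					  unsigned long *zone_start_pfn,
					  unsigned long *zone_end_pfn,
					  const unsigned long *zones_size)
{
	/* The two lines added by the tentative fix. */
	*zone_start_pfn = node_start_pfn;
	*zone_end_pfn = node_end_pfn;
	return zones_size[zone_type];
}

/* Caller pattern matching the warning: locals filled in by the callee. */
static void init_zone_sketch(struct zone_sketch *zone, int zone_type,
			     unsigned long node_start_pfn,
			     unsigned long node_end_pfn,
			     const unsigned long *zones_size)
{
	unsigned long zone_start_pfn, zone_end_pfn; /* not initialized here */

	zone->spanned_pages = spanned_pages_sketch(zone_type, node_start_pfn,
						   node_end_pfn,
						   &zone_start_pfn,
						   &zone_end_pfn, zones_size);
	/* Only well defined because the callee wrote through the pointer. */
	zone->zone_start_pfn = zone_start_pfn;
}
```

Deleting the two stores in spanned_pages_sketch() reproduces the
situation gcc complains about: init_zone_sketch() would then copy an
indeterminate value into zone->zone_start_pfn.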



