Widespread boot failures on ARM due to "mm/page_alloc.c: calculate zone_start_pfn at zone_spanned_pages_in_node()"
Andrew Morton
akpm at linux-foundation.org
Mon Jan 4 15:09:46 PST 2016
On Mon, 4 Jan 2016 22:42:33 +0000 Mark Brown <broonie at kernel.org> wrote:
> Since 20151231 -next has been failing to boot on a wide range of ARM
> platforms in the kernelci.org boot tests[1]. Doing bisections with
> Arndale and BeagleBone Black identifies 904769ac82ebf (mm/page_alloc.c:
> calculate zone_start_pfn at zone_spanned_pages_in_node()) from the akpm
> tree as the first broken commit[2,3]. An example bootlog from the
> failure is:
>
> http://storage.kernelci.org/next/next-20151231/arm-multi_v7_defconfig/lab-cambridge/boot-exynos5250-arndale.html
>
> which shows no output on the console once we start the kernel; a brief
> sampling of failing boards suggests this is the normal failure mode.
> x86 and arm64 targets seem fine (juno shows up as failing but the boot
> log seems fine so it's probably a false positive, Mustang was failing
> already) and there are a small number of ARM platforms that boot. I've
> not yet had any time to investigate further than that (including trying
> a revert of that commit), sorry.
>
> [1] http://kernelci.org/boot/all/job/next/kernel/next-20151231/
> [2] https://ci.linaro.org/view/people/job/tbaker-boot-bisect-bot/135/console
> [3] https://ci.linaro.org/view/people/job/tbaker-boot-bisect-bot/136/console
Thanks. That patch has rather a blooper if
CONFIG_HAVE_MEMBLOCK_NODE_MAP=n. Is that the case in your testing?
Arnd's tentative fix is below.
I shall drop that patchset for now.
From: Arnd Bergmann <arnd at arndb.de>
Subject: mm/page_alloc.c: set a zone_start_pfn value in zone_spanned_pages_in_node
We got a new build warning in linux-next:
mm/page_alloc.c: In function 'free_area_init_node':
mm/page_alloc.c:5278:25: warning: 'zone_start_pfn' may be used uninitialized in this function [-Wmaybe-uninitialized]
zone->zone_start_pfn = zone_start_pfn;
mm/page_alloc.c:5265:17: note: 'zone_start_pfn' was declared here
unsigned long zone_start_pfn, zone_end_pfn;
The code indeed looks wrong, but this is just a guess of what the
fix might be: I have not looked at it in detail, so please treat this
as a bug report.
Signed-off-by: Arnd Bergmann <arnd at arndb.de>
Cc: Taku Izumi <izumi.taku at jp.fujitsu.com>
Cc: Tony Luck <tony.luck at intel.com>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
---
mm/page_alloc.c | 2 ++
1 file changed, 2 insertions(+)
diff -puN mm/page_alloc.c~mm-calculate-zone_start_pfn-at-zone_spanned_pages_in_node-fix mm/page_alloc.c
--- a/mm/page_alloc.c~mm-calculate-zone_start_pfn-at-zone_spanned_pages_in_node-fix
+++ a/mm/page_alloc.c
@@ -5013,6 +5013,8 @@ static inline unsigned long __meminit zo
 					unsigned long *zone_end_pfn,
 					unsigned long *zones_size)
 {
+	*zone_start_pfn = node_start_pfn;
+	*zone_end_pfn = node_end_pfn;
 	return zones_size[zone_type];
 }
_
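
For readers who want to see the failure mode in isolation, here is a minimal
user-space sketch of the pattern the warning points at. It is not the kernel
code: the demo_* names, DEMO_NR_ZONES, and struct demo_zone are hypothetical,
and the helper only imitates the shape of the simple variant of
zone_spanned_pages_in_node() that the tentative fix touches. The caller copies
an out-parameter into the zone, as free_area_init_node() does in the warning
above, so the helper has to store through both pointers before returning.

```c
#include <assert.h>

#define DEMO_NR_ZONES 2	/* hypothetical zone count for this sketch */

/* Hypothetical, cut-down stand-in for the fields the caller fills in. */
struct demo_zone {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

/*
 * Imitates the simple helper from the patch: the size comes straight
 * from zones_size[], and the zone range is taken to be the node range.
 * Dropping the two stores below reproduces the reported problem: the
 * caller would then read an uninitialized zone_start_pfn.
 */
static unsigned long demo_spanned_pages(unsigned long zone_type,
					unsigned long node_start_pfn,
					unsigned long node_end_pfn,
					unsigned long *zone_start_pfn,
					unsigned long *zone_end_pfn,
					const unsigned long *zones_size)
{
	assert(zone_type < DEMO_NR_ZONES);
	*zone_start_pfn = node_start_pfn;
	*zone_end_pfn = node_end_pfn;
	return zones_size[zone_type];
}

/* Caller pattern from the warning: locals are filled by the helper. */
static struct demo_zone demo_init_zone(unsigned long zone_type,
				       unsigned long node_start_pfn,
				       unsigned long node_end_pfn,
				       const unsigned long *zones_size)
{
	unsigned long zone_start_pfn, zone_end_pfn;
	struct demo_zone zone;

	zone.spanned_pages = demo_spanned_pages(zone_type, node_start_pfn,
						node_end_pfn, &zone_start_pfn,
						&zone_end_pfn, zones_size);
	zone.zone_start_pfn = zone_start_pfn;	/* defined only via the helper */
	(void)zone_end_pfn;			/* unused in this sketch */
	return zone;
}
```

Calling demo_init_zone(0, 0x80000, 0x81000, sizes) with sizes[0] = 4096 gives
a zone that starts at pfn 0x80000 and spans 4096 pages. Whether the node range
is the right value to store in the real kernel is exactly what the patch
description leaves open.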