Qemu v9.0.2: Boot failed qemu-arm with Linux next-20241017 tag.

Arnd Bergmann arnd at arndb.de
Wed Oct 23 09:24:43 PDT 2024


On Sun, Oct 20, 2024, at 17:39, Naresh Kamboju wrote:
> On Fri, 18 Oct 2024 at 12:35, Naresh Kamboju <naresh.kamboju at linaro.org> wrote:
>>
>> The QEMU-ARMv7 boot has failed with the Linux next-20241017 tag.
>> The boot log is incomplete, and no kernel crash was detected.
>> However, the system did not proceed far enough to reach the login prompt.
>>

> Anders bisected this boot regression and found:
> # first bad commit:
>   [efe8419ae78d65e83edc31aad74b605c12e7d60c]
>     vdso: Introduce vdso/page.h
>
> We are investigating the reason for boot failure due to this commit.

Anders and I analyzed this. The problem turned out to be
the early_init_dt_add_memory_arch() function in
drivers/of/fdt.c, which does bitwise operations on PAGE_MASK
with a 'u64' instead of phys_addr_t:

void __init __weak early_init_dt_add_memory_arch(u64 base, u64 size)
{
        const u64 phys_offset = MIN_MEMBLOCK_ADDR;
 
        if (size < PAGE_SIZE - (base & ~PAGE_MASK)) {
                pr_warn("Ignoring memory block 0x%llx - 0x%llx\n",
                        base, base + size);
                return;
        }

        if (!PAGE_ALIGNED(base)) {
                size -= PAGE_SIZE - (base & ~PAGE_MASK);
                base = PAGE_ALIGN(base);
        }

On non-LPAE arm32, the patch changed PAGE_MASK from a signed
to an unsigned number, which broke the existing behavior for
large 32-bit memory sizes. The obvious fix is to change the
PAGE_MASK definition for 32-bit arm back to a signed number.
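
To illustrate the kind of difference involved, here is a minimal
user-space sketch, not the kernel code. All DEMO_* macros and demo_*
functions are made up for this example, and uint32_t/int32_t stand in
for a 32-bit 'unsigned long' and 'int'. It only shows how the two
flavors of mask behave when combined with a 64-bit value; it does not
claim to reproduce the exact failing path in fdt.c.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12

/* Hypothetical stand-ins for a 32-bit 'unsigned long' and 'int' mask. */
#define DEMO_PAGE_MASK_UNSIGNED (~(((uint32_t)1 << DEMO_PAGE_SHIFT) - 1))
#define DEMO_PAGE_MASK_SIGNED   (~(((int32_t)1 << DEMO_PAGE_SHIFT) - 1))

/* The unsigned mask is zero-extended to 64 bits: the upper half is cleared. */
static inline uint64_t demo_mask_unsigned(uint64_t v)
{
        return v & DEMO_PAGE_MASK_UNSIGNED;
}

/* The signed mask is sign-extended to 64 bits: the upper half is preserved. */
static inline uint64_t demo_mask_signed(uint64_t v)
{
        return v & DEMO_PAGE_MASK_SIGNED;
}

/* Same 64-bit input, two different results depending on the mask type. */
static inline void demo_self_check(void)
{
        const uint64_t v = 0x100001234ULL;

        assert(demo_mask_unsigned(v) == 0x00001000ULL);
        assert(demo_mask_signed(v) == 0x100001000ULL);
}
```

Note that '~mask' comes out as 0xfff in both variants, so only
expressions that AND a 64-bit value with the mask itself differ.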

mips32, ppc32 and hexagon had the same signed definition,
so I think we should change at least those in order to
restore the previous behavior in case they are affected
by the same bug (or a different one).

x86-32 and arc got flipped the other way by the patch,
from unsigned to signed, when CONFIG_ARC_HAS_PAE40
or CONFIG_X86_PAE are set. I think we should keep
the 'signed' behavior as this was a bugfix by itself,
but we may want to change arc and x86-32 with short
phys_addr_t the same way for consistency.

On csky, m68k, microblaze, nios2, openrisc, parisc32,
riscv32, sh, sparc32, um and xtensa, we've always used
the 'unsigned' PAGE_MASK, and there is no 64-bit
phys_addr_t, so I would lean towards staying with
'unsigned' in order to not introduce a regression.
Alternatively we could choose to go with the 'signed'
version on all 32-bit architectures unconditionally
for consistency. Any preferences?

      Arnd


