-next boot failures during KVM setup
Ard Biesheuvel
ardb at kernel.org
Mon Jun 8 13:56:12 PDT 2026
On Mon, 8 Jun 2026, at 22:18, Marc Zyngier wrote:
> [+ Will, Catalin, Ard]
>
> On Mon, 08 Jun 2026 20:19:37 +0100,
> Mark Brown <broonie at kernel.org> wrote:
>>
>> I'm seeing boot failures on a range of physical arm64 platforms in
>> today's -next. Turning on earlycon it looks like we're getting bad
>> pointer dereferences during KVM initialisation:
>>
>> [ 0.728923] kvm [1]: nv: 570 coarse grained trap handlers
>> [ 0.735138] kvm [1]: nv: 710 fine grained trap handlers
>> [ 0.741326] kvm [1]: IPA Size Limit: 40 bits
>> [ 0.748840] Unable to handle kernel paging request at virtual address ffff00000478e000
>
> That really doesn't look like a duff pointer.
>
>> [ 0.757027] Mem abort info:
>> [ 0.759917] ESR = 0x0000000096000147
>
> Translation fault, level 3. My take is that something is getting
> unmapped.
>
...
> I've reproduced with -next on an A72 platform. But it doesn't happen
> with kvmarm/next on its own. So it is likely something coming from
> another tree that messes up with CMOs, or .
>
> The stack trace here is slightly better:
>
> [ 0.099138] Unable to handle kernel paging request at virtual address ffff0023d9ead000
...
> [ 2.136462] Call trace:
> [ 2.138896] dcache_clean_inval_poc+0x24/0x48 (P)
> [ 2.143592] init_hyp_mode+0x644/0x960
> [ 2.147333] kvm_arm_init+0x128/0x280
> [ 2.150987] do_one_initcall+0x4c/0x458
> [ 2.154813] kernel_init_freeable+0x1f4/0x2a0
> [ 2.159161] kernel_init+0x2c/0x150
> [ 2.162642] ret_from_fork+0x10/0x20
> [ 2.166210] Code: 9ac32042 d1000443 8a230000 d503201f (d50b7e20)
> [ 2.172292] ---[ end trace 0000000000000000 ]---
> [ 2.176958] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [ 2.184608] SMP: stopping secondary CPUs
> [ 2.188523] Kernel Offset: 0x47dbd5dc0000 from 0xffff800080000000
> [ 2.194604] PHYS_OFFSET: 0x80000000
> [ 2.198080] CPU features: 0x04000000,804b0008,00040001,0400421b
> [ 2.203988] Memory Limit: none
> [ 2.207031] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
>
> This points to the following code in kvm_hyp_init_symbols():
>
> <quote>
> 	/*
> 	 * Flush entire BSS since part of its data containing init symbols is read
> 	 * while the MMU is off.
> 	 */
> 	kvm_flush_dcache_to_poc(kvm_ksym_ref(__hyp_bss_start),
> 				kvm_ksym_ref(__hyp_bss_end) - kvm_ksym_ref(__hyp_bss_start));
>
> </quote>
>
> which I suspect is related to some of the new BSS related code in
> arm64/for-next/mm.
>
> Ard, does this ring a bell?
>
Haven't seen this myself, surprisingly, but yeah, this is obviously related.

By now, I am wondering if unmapping that region entirely is really worth the
hassle, or whether we'd be better off just remapping it read-only.

Given we're at -rc7, I'd lean towards dropping the whole branch for now, or
alternatively, only drop/revert "arm64: mm: Unmap kernel data/bss entirely
from the linear map" (and its followup fix "arm64: mm: Defer remap of linear
alias of data/bss") so that the region always remains readable via the linear
map.
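
As a side note on the ESR value quoted above (0x0000000096000147), here is a
minimal sketch of how it decodes, assuming the standard ESR_ELx layout for a
data abort. The helper names are made up for illustration and are not the
kernel's own macros; the decode only covers translation faults at levels 0-3.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers: extract ESR_ELx fields relevant to a data abort. */
static inline unsigned int esr_ec(uint64_t esr)   { return (esr >> 26) & 0x3f; } /* exception class */
static inline unsigned int esr_il(uint64_t esr)   { return (esr >> 25) & 0x1; }  /* 32-bit instruction */
static inline unsigned int esr_cm(uint64_t esr)   { return (esr >> 8) & 0x1; }   /* cache maintenance op */
static inline unsigned int esr_wnr(uint64_t esr)  { return (esr >> 6) & 0x1; }   /* write, not read */
static inline unsigned int esr_dfsc(uint64_t esr) { return esr & 0x3f; }         /* fault status code */

/* DFSC 0b0001LL encodes a translation fault at level LL (0..3). */
static inline int dfsc_is_translation_fault(unsigned int dfsc)
{
	return (dfsc & 0x3c) == 0x04;
}

static inline unsigned int dfsc_level(unsigned int dfsc)
{
	return dfsc & 0x3;
}

/* Print a one-line summary of the fields decoded above. */
void esr_describe(uint64_t esr)
{
	unsigned int dfsc = esr_dfsc(esr);

	printf("EC=0x%02x IL=%u CM=%u WnR=%u DFSC=0x%02x",
	       esr_ec(esr), esr_il(esr), esr_cm(esr), esr_wnr(esr), dfsc);
	if (dfsc_is_translation_fault(dfsc))
		printf(" (translation fault, level %u)", dfsc_level(dfsc));
	printf("\n");
}
```

Calling esr_describe(0x96000147) reports EC=0x25 (data abort taken from the
current EL), CM=1 and DFSC=0x07, i.e. a level 3 translation fault raised by a
cache maintenance instruction, which fits the dcache_clean_inval_poc frame at
the top of the call trace.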