[PATCH v2 29/45] KVM: arm64: GICv3: Set ICH_HCR_EL2.TDIR when interrupts overflow LR capacity
Marc Zyngier
maz at kernel.org
Mon Nov 24 05:40:35 PST 2025
On Mon, 24 Nov 2025 13:23:08 +0000,
Mark Brown <broonie at kernel.org> wrote:
>
> On Mon, Nov 24, 2025 at 01:06:29PM +0000, Marc Zyngier wrote:
> > Mark Brown <broonie at kernel.org> wrote:
>
> > > FWIW I am seeing this on i.MX8MP (4xA53+GICv3):
>
> > > https://lava.sirena.org.uk/scheduler/job/2118713#L1044
>
> > There are worrying errors way before that, in the VMID allocator init,
> > and I can't see what the GIC has to do with it. The issue Fuad
> reported was at run time, not boot time, so this really doesn't align
> > with what you are seeing.
>
> Yeah, I was just looking further and realising it was probably
> different - sorry about that. After seeing the qemu issue he was
> hitting, I was checking what else was failing; none of the platforms
> are booting one way or another. FWIW, with earlycon, the AM625 is
> showing similar issues to the i.MX8MP.
That's the initial warning:
WARN_ON(NUM_USER_VMIDS - 1 <= num_possible_cpus());
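For context, that check lives in kvm_arm_vmid_alloc_init() in
arch/arm64/kvm/vmid.c. A paraphrased sketch from memory, not a verbatim
quote of what is in -next:

#define VMID_FIRST_VERSION	(1UL << kvm_arm_vmid_bits)
#define NUM_USER_VMIDS		VMID_FIRST_VERSION /* 256 with 8-bit VMIDs */

int __init kvm_arm_vmid_alloc_init(void)
{
	kvm_arm_vmid_bits = kvm_get_vmid_bits();

	/*
	 * Expect allocation after rollover to fail if we don't have
	 * at least one more VMID than CPUs. VMID #0 is reserved.
	 */
	WARN_ON(NUM_USER_VMIDS - 1 <= num_possible_cpus());

	atomic64_set(&vmid_generation, VMID_FIRST_VERSION);
	vmid_map = bitmap_zalloc(NUM_USER_VMIDS, GFP_KERNEL);
	return vmid_map ? 0 : -ENOMEM;
}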
The register state:
[ 224.378174] pc : kvm_arm_vmid_alloc_init+0xa0/0xc0
[ 224.382954] lr : kvm_arm_vmid_alloc_init+0x24/0xc0
[ 224.387734] sp : ffff80008009bd40
[ 224.391035] x29: ffff80008009bd40 x28: ffff0020209bd3c0 x27: ffffce5349159068
[ 224.398162] x26: ffffce5349070118 x25: ffffce5348fb8eb8 x24: ffffce5349059128
[ 224.405287] x23: 0000000000000109 x22: ffff0020208ea6c0 x21: 0000000000000004
[ 224.412413] x20: ffffce5349c20b78 x19: 0000000000000000 x18: 00000000ffffffff
[ 224.419538] x17: 00000000e9a61a0d x16: 00000000b1c06f2c x15: 00000000ffffffff
[ 224.426663] x14: 0000000000000000 x13: 7374696220343420 x12: 3a74696d694c2065
[ 224.433789] x11: ffffffffffe00000 x10: ffff00275c260000 x9 : ffffce5348048be0
[ 224.440914] x8 : 00000000fffeffff x7 : ffff00275c260000 x6 : 80000000ffff0000
[ 224.448039] x5 : 0000000000000048 x4 : 0000000000000110 x3 : ffffce5348fc1000
[ 224.455164] x2 : 0000000000000100 x1 : 0000000000000100 x0 : 00000000000000ff
The disassembly:
ffff8000816ff220 <kvm_arm_vmid_alloc_init>:
ffff8000816ff220: d503201f nop
ffff8000816ff224: d503201f nop
ffff8000816ff228: d503233f paciasp
ffff8000816ff22c: a9be7bfd stp x29, x30, [sp, #-32]!
ffff8000816ff230: 5280e400 mov w0, #0x720 // #1824
ffff8000816ff234: 910003fd mov x29, sp
ffff8000816ff238: 72a00300 movk w0, #0x18, lsl #16
ffff8000816ff23c: f9000bf3 str x19, [sp, #16]
ffff8000816ff240: 97a4a61c bl ffff800080028ab0 <read_sanitised_ftr_reg>
ffff8000816ff244: d3441c00 ubfx x0, x0, #4, #4
ffff8000816ff248: d0fffa02 adrp x2, ffff800081641000 <rodata_full>
ffff8000816ff24c: d0fffa03 adrp x3, ffff800081641000 <rodata_full>
ffff8000816ff250: f100081f cmp x0, #0x2
ffff8000816ff254: 52800201 mov w1, #0x10 // #16
ffff8000816ff258: b940f044 ldr w4, [x2, #240]
ffff8000816ff25c: 52800102 mov w2, #0x8 // #8
ffff8000816ff260: d29fffe0 mov x0, #0xffff // #65535
ffff8000816ff264: 1a820021 csel w1, w1, w2, eq // eq = none
ffff8000816ff268: d2801fe2 mov x2, #0xff // #255
ffff8000816ff26c: b9005061 str w1, [x3, #80]
ffff8000816ff270: 9a820000 csel x0, x0, x2, eq // eq = none
ffff8000816ff274: d2802001 mov x1, #0x100 // #256
ffff8000816ff278: d2a00022 mov x2, #0x10000 // #65536
ffff8000816ff27c: 9a810042 csel x2, x2, x1, eq // eq = none
ffff8000816ff280: eb00009f cmp x4, x0
ffff8000816ff284: 540001e2 b.cs ffff8000816ff2c0 <kvm_arm_vmid_alloc_init+0xa0> // b.hs, b.nlast
That's the branch to the...
[...]
ffff8000816ff2c0: d4210000 brk #0x800
... BRK instruction.
So x0=255 (NUM_USER_VMIDS - 1 with 8-bit VMIDs) and x4=272 (supposedly
num_possible_cpus()). 272 possible CPUs on a machine with only 16?
Bollocks.
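For reference, the ubfx/cmp/csel dance above is just the compiler
inlining the VMID-width selection. A sketch from memory of the helpers
in arch/arm64/include/asm/kvm_mmu.h, not a verbatim quote:

/* 0x180720 in w0 is the sysreg encoding of ID_AA64MMFR1_EL1 */
static inline unsigned int get_vmid_bits(u64 mmfr1)
{
	/* the ubfx extracts VMIDBits, bits [7:4] of ID_AA64MMFR1_EL1 */
	int vmid_bits = cpuid_feature_extract_unsigned_field(mmfr1,
				ID_AA64MMFR1_EL1_VMIDBits_SHIFT);

	/* VMIDBits == 0b0010 (the cmp #0x2 above) advertises FEAT_VMID16 */
	if (vmid_bits == ID_AA64MMFR1_EL1_VMIDBits_16)
		return 16;

	return 8;
}

static inline unsigned int kvm_get_vmid_bits(void)
{
	return get_vmid_bits(read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1));
}

With 8-bit VMIDs, NUM_USER_VMIDS - 1 is the 255 sitting in x0, and x4
is whatever got loaded from rodata at [x2, #240], presumably the cached
CPU count.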
Something is badly screwed in -next, and I'm not convinced it is KVM.
d0f23ccf6ba9e ("cpumask: Cache num_possible_cpus()")

is my current suspect.
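I haven't dug into that patch yet, so this is purely illustrative and
not necessarily what it does, but the obvious failure mode for such a
cache is initialisation ordering:

/* Hypothetical sketch, NOT a quote of d0f23ccf6ba9e */
static unsigned int cached_num_possible_cpus;

static int __init cache_possible_cpus(void)
{
	/*
	 * If this snapshot is taken before the architecture code has
	 * finished shaping cpu_possible_mask, every later caller sees
	 * a stale count instead of the real one (16 here), and checks
	 * like the VMID WARN_ON above trip for no good reason.
	 */
	cached_num_possible_cpus = cpumask_weight(cpu_possible_mask);
	return 0;
}
early_initcall(cache_possible_cpus);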
M.
--
Without deviation from the norm, progress is not possible.