4.15-rc1 crash on Midway in teardown_hyp_mode()

Andre Przywara andre.przywara at arm.com
Wed Dec 6 06:11:02 PST 2017


Hi,

while trying to boot 4.15-rc1 on my Calxeda Midway I observed a crash
(see below). I can't look further into this today, but wanted to report
this anyway.

Digging around a bit this is due to the VGIC not initializing properly
due to GICC being advertised as just 4K, not 8K.
This can be worked around by adjusting the DT or using
irqchip.gicv2_force_probe. However this still raises some questions:
1) Even if the VGIC fails to register, we should certainly not crash.
The chain of events seems to be:
virt/kvm/arm/arm.c:init_subsystems():
  - kvm_vgic_hyp_init() returns -ENODEV, this leads to vgic_present
    being set to false, but "err" being reset to 0 (meaning: carry on).
    However this seems now to miss some initialization.
  - kvm_timer_hyp_init() now fails on calling irq_set_vcpu_affinity(),
    because this returns -ENOSYS. This leads to it returning this error,
    init_subsystems() failing and subsequently tearing down KVM.
  - This seems to have some bug and leads to the kernel crash.

Even with the VGIC not being usable, we should be able to cleanly tear
down KVM (or HYP?).

2) Is it intended that an unusable VGIC now denies KVM entirely? I
believe in the past we could live with that (no arch timer
virtualization, no in-kernel GIC emulation) and rely on userland
emulation (for instance in QEMU). This seemed to have changed now?
3) Wouldn't it be smarter to fix up the GICC range by default, if we
have enough evidence that the GICC is actually 8K? Shouldn't this be
true for every architecture compliant GICv2, actually? So whenever we
see "arm,cortex-a15-gic", for instance, we force GICC to 8K?
Or do we know of GICs which have only 4K, but advertise themselves
wrongly? Otherwise this could just go as some firmware quirk, based on a
compatible string, for instance, or some ID registers.

The reason I am asking is that the Midway loads the DT from firmware
flash, and this one hasn't changed in years (for obvious reasons). So
while *I* am able to update the DT in the SPI flash, I guess many users
just won't do so, so they are left with a crashing kernel (or loosing
KVM), starting from 4.15. All the previous kernels booted and ran KVM
guests fine in the past with the existing DT.

Cheers,
Andre.

----------------
Booting Linux on physical CPU 0x0
Linux version 4.15.0-rc2-00174-g328b4ed93b69-dirty
(andprz01 at e104803-lin) (gcc version 5.3.0 (GCC)) #614 SMP Wed Dec 6
12:19:02 GMT 2017
CPU: ARMv7 Processor [413fc0f2] revision 2 (ARMv7), cr=30c5387d
CPU: div instructions available: patching division code
CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
OF: fdt: Machine model: Calxeda ECX-2000
Memory policy: Data cache writealloc
efi: Getting EFI parameters from FDT:
efi: UEFI not found.
cma: Reserved 64 MiB at 0x00000000fb800000
psci: probing for conduit method from DT.
psci: Using PSCI v0.1 Function IDs from DT
percpu: Embedded 18 pages/cpu @(ptrval) s42240 r8192 d23296 u73728
Built 1 zonelists, mobility grouping on.  Total pages: 2093568
Kernel command line: console=ttyAMA0,115200n8 ro root=/dev/sda7
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 8225660K/8380416K available (12288K kernel code, 1565K rwdata,
4052K rodata, 2048K init, 564K bss, 89220K reserved, 65536K
cma-reserved, 7528448K highmem)
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
    vmalloc : 0xf0800000 - 0xff800000   ( 240 MB)
    lowmem  : 0xc0000000 - 0xf0000000   ( 768 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    modules : 0xbf000000 - 0xbfe00000   (  14 MB)
      .text : 0x(ptrval) - 0x(ptrval)   (14304 kB)
      .init : 0x(ptrval) - 0x(ptrval)   (2048 kB)
      .data : 0x(ptrval) - 0x(ptrval)   (1566 kB)
       .bss : 0x(ptrval) - 0x(ptrval)   ( 565 kB)
SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
ftrace: allocating 46629 entries in 137 pages
Hierarchical RCU implementation.
        RCU restricting CPUs from NR_CPUS=16 to nr_cpu_ids=4.
RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
GIC: GICv2 detected, but range too small and irqchip.gicv2_force_probe
not set
arch_timer: cp15 timer(s) running at 150.00MHz (phys).
clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles:
0x2298375bd0, max_idle_ns: 440795208267 ns
sched_clock: 56 bits at 150MHz, resolution 6ns, wraps every 2199023255551ns
Switching to timer-based delay loop, resolution 6ns
clocksource: arm,sp804: mask: 0xffffffff max_cycles: 0xffffffff,
max_idle_ns: 12741736309 ns
sched_clock: 32 bits at 150MHz, resolution 6ns, wraps every 14316557820ns
Console: colour dummy device 80x30
Calibrating delay loop (skipped), value calculated using timer
frequency.. 300.00 BogoMIPS (lpj=1500000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
CPU: Testing write buffer coherency: ok
/cpus/cpu at 0 missing clock-frequency property
/cpus/cpu at 1 missing clock-frequency property
/cpus/cpu at 2 missing clock-frequency property
/cpus/cpu at 3 missing clock-frequency property
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Setting up static identity map for 0x400000 - 0x400178
Hierarchical SRCU implementation.
EFI services will not be available.
smp: Bringing up secondary CPUs ...
CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
CPU2: thread -1, cpu 2, socket 0, mpidr 80000002
CPU3: thread -1, cpu 3, socket 0, mpidr 80000003
smp: Brought up 1 node, 4 CPUs
SMP: Total of 4 processors activated (1200.00 BogoMIPS).
CPU: All CPU(s) started in HYP mode.
CPU: Virtualization extensions available.
devtmpfs: initialized
random: get_random_u32 called from bucket_table_alloc+0xf8/0x24c with
crng_init=0
VFP support v0.3: implementor 41 architecture 4 part 30 variant f rev 0
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff,
max_idle_ns: 19112604462750000 ns
futex hash table entries: 1024 (order: 5, 131072 bytes)
pinctrl core: initialized pinctrl subsystem
DMI not present or invalid.
NET: Registered protocol family 16
DMA: preallocated 256 KiB pool for atomic coherent allocations
cpuidle: using governor ladder
cpuidle: using governor menu
hw-breakpoint: found 5 (+1 reserved) breakpoint and 4 watchpoint registers.
hw-breakpoint: maximum watchpoint size is 8 bytes.
Serial: AMBA PL011 UART driver
fff36000.serial: ttyAMA0 at MMIO 0xfff36000 (irq = 31, base_baud = 0) is
a PL011 rev3
console [ttyAMA0] enabled
AT91: Could not find identification node
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
<giometti at linux.it>
PTP clock support registered
EDAC MC: Ver: 3.0.0
dmi: Firmware registration failed.
clocksource: Switched to clocksource arch_sys_counter
NET: Registered protocol family 2
TCP established hash table entries: 8192 (order: 3, 32768 bytes)
TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
UDP hash table entries: 512 (order: 2, 16384 bytes)
UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
kvm [1]: 8-bit VMID
kvm [1]: IDMAP page: 401000
kvm [1]: HYP VA range: c0000000:ffffffff
kvm_vgic_hyp_init() returns 0, vgic_present: 0
kvm [1]: == kvm_timer_hyp_init: host_vtimer_irq: 19
kvm [1]: == kvm_timer_hyp_init: host_vtimer_irq_flags: 8
kvm [1]: == kvm_timer_hyp_init: request_percpu_irq() returned 0
kvm [1]: kvm_arch_timer: error setting vcpu affinity: -38
disabling KVM arch hardware on each CPU
 ... done that.
Unable to handle kernel paging request at virtual address 40000000
pgd = (ptrval)
[40000000] *pgd=80000000205003, *pmd=00000000
Internal error: Oops: 206 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted
4.15.0-rc2-00174-g328b4ed93b69-dirty #614
Hardware name: Highbank
PC is at unmap_hyp_range+0x15c/0x3f8
LR is at free_hyp_pgds+0x168/0x1a0
pc : [<c041a3a8>]    lr : [<c041bda8>]    psr: 80000013
sp : ed11fda0  ip : ed11fe10  fp : ed11fe0c
r10: 40000000  r9 : 00000000  r8 : 00000000
r7 : c19ba980  r6 : 00200000  r5 : 00000000  r4 : 3fffffff
r3 : 40000000  r2 : c19ba980  r1 : 00000000  r0 : 001fffff
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 30c5387d  Table: 00203000  DAC: 55555555
Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
Stack: (0xed11fda0 to 0xed120000)
fda0: 00000000 c19ba980 c13a5054 40000000 30800000 00000001 307fffff
00000001
fdc0: 40000000 00000000 ed1b7b40 ed1b3000 3fffffff 00000000 40000000
00000000
fde0: c19881f8 40000000 00000000 c19881f8 c19ba990 30800000 00000000
ffffffda
fe00: ed11fe3c ed11fe10 c041bda8 c041a258 40000000 00000000 c18053d0
c177c08c
fe20: 000053d0 0000585c c180585c c180585c ed11fe5c ed11fe40 c0417e4c
c041bc4c
fe40: c18053d0 00000010 c1988e48 c19881e8 ed11fe94 ed11fe60 c0419ee8
c0417e30
fe60: ed11febc 00000000 c1602568 c1987800 c0417e9c 00000000 00000000
00000ee8
fe80: c1987800 00000153 ed11fecc ed11fe98 c040c94c c0419c0c c1987800
c1987840
fea0: ed11febc ed11feb0 c1987800 c0417e9c c1600640 c16dc83c 00000000
c1987800
fec0: ed11fedc ed11fed0 c0417ec4 c040c930 ed11ff4c ed11fee0 c0402470
c0417ea8
fee0: c1600664 c0f5f9c8 c13f4f50 00000153 ed11ff4c ed11ff00 c0489724
c160064c
ff00: c129317c 00000006 00000006 00000000 c13f34d4 c136a0d8 efffca9c
efffcaa1
ff20: 00000000 c1987800 c16dc830 c1987800 c16dc834 c1600640 c16dc83c
00000006
ff40: ed11ff94 ed11ff50 c1600f74 c040242c 00000006 00000006 00000000
c1600640
ff60: 00000001 c177a164 00000000 00000000 c0f65544 00000000 00000000
00000000
ff80: 00000000 00000000 ed11ffac ed11ff98 c0f6555c c1600e2c 00000000
c0f65544
ffa0: 00000000 ed11ffb0 c042be58 c0f65550 00000000 00000000 00000000
00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 ed11fff4
00000000
[<c041a3a8>] (unmap_hyp_range) from [<c041bda8>] (free_hyp_pgds+0x168/0x1a0)
[<c041bda8>] (free_hyp_pgds) from [<c0417e4c>] (teardown_hyp_mode+0x28/0x78)
[<c0417e4c>] (teardown_hyp_mode) from [<c0419ee8>]
(kvm_arch_init+0x2e8/0x51c)
[<c0419ee8>] (kvm_arch_init) from [<c040c94c>] (kvm_init+0x28/0x2bc)
[<c040c94c>] (kvm_init) from [<c0417ec4>] (arm_init+0x28/0x2c)
[<c0417ec4>] (arm_init) from [<c0402470>] (do_one_initcall+0x50/0x17c)
[<c0402470>] (do_one_initcall) from [<c1600f74>]
(kernel_init_freeable+0x154/0x1f4)
[<c1600f74>] (kernel_init_freeable) from [<c0f6555c>]
(kernel_init+0x18/0x124)
[<c0f6555c>] (kernel_init) from [<c042be58>] (ret_from_fork+0x14/0x3c)
Code: e1510005 e1a06a86 e2460001 01500004 (e1c300d0)
---[ end trace affbac93bd070906 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

CPU3: stopping
CPU: 3 PID: 0 Comm: swapper/3 Tainted: G      D
4.15.0-rc2-00174-g328b4ed93b69-dirty #614
Hardware name: Highbank
[<c043644c>] (unwind_backtrace) from [<c04304ec>] (show_stack+0x20/0x24)
[<c04304ec>] (show_stack) from [<c0f4f034>] (dump_stack+0x98/0xac)
[<c0f4f034>] (dump_stack) from [<c0433f84>] (handle_IPI+0x2d0/0x350)
[<c0433f84>] (handle_IPI) from [<c0401954>] (gic_handle_irq+0xa8/0xac)
[<c0401954>] (gic_handle_irq) from [<c04310b8>] (__irq_svc+0x58/0x74)
Exception stack(0xed155f28 to 0xed155f70)
5f20:                   00000001 00000000 00000000 c0440ba0 c18052cc
ed154000
5f40: c1805338 00000008 c17847b8 00000000 00000000 ed155f84 ed155f88
ed155f78
5f60: c042c954 c042c958 60000013 ffffffff
[<c04310b8>] (__irq_svc) from [<c042c958>] (arch_cpu_idle+0x48/0x4c)
[<c042c958>] (arch_cpu_idle) from [<c0f6c7f8>] (default_idle_call+0x30/0x3c)
[<c0f6c7f8>] (default_idle_call) from [<c04aa33c>] (do_idle+0x194/0x228)
[<c04aa33c>] (do_idle) from [<c04aa67c>] (cpu_startup_entry+0x28/0x2c)
[<c04aa67c>] (cpu_startup_entry) from [<c0433a68>]
(secondary_start_kernel+0x160/0x16c)
[<c0433a68>] (secondary_start_kernel) from [<004021ec>] (0x4021ec)
CPU2: stopping
CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D
4.15.0-rc2-00174-g328b4ed93b69-dirty #614
Hardware name: Highbank
[<c043644c>] (unwind_backtrace) from [<c04304ec>] (show_stack+0x20/0x24)
[<c04304ec>] (show_stack) from [<c0f4f034>] (dump_stack+0x98/0xac)
[<c0f4f034>] (dump_stack) from [<c0433f84>] (handle_IPI+0x2d0/0x350)
[<c0433f84>] (handle_IPI) from [<c0401954>] (gic_handle_irq+0xa8/0xac)
[<c0401954>] (gic_handle_irq) from [<c04310b8>] (__irq_svc+0x58/0x74)
Exception stack(0xed153f28 to 0xed153f70)
3f20:                   00000001 00000000 00000000 c0440ba0 c18052cc
ed152000
3f40: c1805338 00000004 c17847b8 00000000 00000000 ed153f84 ed153f88
ed153f78
3f60: c042c954 c042c958 60000013 ffffffff
[<c04310b8>] (__irq_svc) from [<c042c958>] (arch_cpu_idle+0x48/0x4c)
[<c042c958>] (arch_cpu_idle) from [<c0f6c7f8>] (default_idle_call+0x30/0x3c)
[<c0f6c7f8>] (default_idle_call) from [<c04aa33c>] (do_idle+0x194/0x228)
[<c04aa33c>] (do_idle) from [<c04aa67c>] (cpu_startup_entry+0x28/0x2c)
[<c04aa67c>] (cpu_startup_entry) from [<c0433a68>]
(secondary_start_kernel+0x160/0x16c)
[<c0433a68>] (secondary_start_kernel) from [<004021ec>] (0x4021ec)
CPU0: stopping
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G      D
4.15.0-rc2-00174-g328b4ed93b69-dirty #614
Hardware name: Highbank
[<c043644c>] (unwind_backtrace) from [<c04304ec>] (show_stack+0x20/0x24)
[<c04304ec>] (show_stack) from [<c0f4f034>] (dump_stack+0x98/0xac)
[<c0f4f034>] (dump_stack) from [<c0433f84>] (handle_IPI+0x2d0/0x350)
[<c0433f84>] (handle_IPI) from [<c0401954>] (gic_handle_irq+0xa8/0xac)
[<c0401954>] (gic_handle_irq) from [<c04310b8>] (__irq_svc+0x58/0x74)
Exception stack(0xc1801ed8 to 0xc1801f20)
1ec0:                                                       00000001
00000000
1ee0: 00000000 c0440ba0 c18052cc c1800000 c1805338 00000001 c17847b8
00000000
1f00: 00000000 c1801f34 c1801f38 c1801f28 c042c954 c042c958 60000013
ffffffff
[<c04310b8>] (__irq_svc) from [<c042c958>] (arch_cpu_idle+0x48/0x4c)
[<c042c958>] (arch_cpu_idle) from [<c0f6c7f8>] (default_idle_call+0x30/0x3c)
[<c0f6c7f8>] (default_idle_call) from [<c04aa33c>] (do_idle+0x194/0x228)
[<c04aa33c>] (do_idle) from [<c04aa67c>] (cpu_startup_entry+0x28/0x2c)
[<c04aa67c>] (cpu_startup_entry) from [<c0f65540>] (rest_init+0xbc/0xc0)
[<c0f65540>] (rest_init) from [<c1600e14>] (start_kernel+0x3d4/0x3e0)
[<c1600e14>] (start_kernel) from [<00000000>] (  (null))
---[ end Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0000000b



More information about the linux-arm-kernel mailing list