arm64 regression in kernel 5.12 related to the (n)VHE

Rafał Miłecki zajec5 at gmail.com
Thu Aug 12 05:29:56 PDT 2021


On 12.08.2021 12:13, Marc Zyngier wrote:
> On Thu, 12 Aug 2021 09:24:14 +0100,
> Rafał Miłecki <zajec5 at gmail.com> wrote:
>>
>> On 12.08.2021 09:57, Marc Zyngier wrote:
>>> On Thu, 12 Aug 2021 08:32:02 +0100,
>>> Rafał Miłecki <zajec5 at gmail.com> wrote:
>>>>
>>>> On 12.08.2021 08:51, Marc Zyngier wrote:
>>>>> Interestingly, all your CPUs are booting at EL2. Which is great.  Can
>>>>> you try and enable KVM on your existing 5.10 kernel? Just selecting
>>>>> CONFIG_KVM should be enough. Does it boot correctly with KVM enabled?
>>>>>
>>>>> My suspicion is that the firmware doesn't set SCR_EL3.HCE, and that
>>>>> the HVC instruction UNDEFs at EL1. That would be bad news.
>>>>
>>>> Interesting! I had to enable CONFIG_VIRTUALIZATION and CONFIG_NET first.
>>>> First I verified kernel built with those options still boots. It does.
>>>>
>>>> Then I enabled CONFIG_KVM and kernel seems to hang around switching from
>>>> bootconsole to the console.
>>>>
>>>> Starting program at 0x0000000000080000
>>>> /memory = 0x40000000
>>>> WARNING: Node's property /reserved-memory/dt_reserved_buffer is not defined
>>>> WARNING: Node's property /reserved-memory/dt_reserved_flow is not defined
>>>> WARNING: Node's property /reserved-memory/dt_reserved_dhd2 is not defined
>>>> Booting Linux on physical CPU 0x0000000000 [0x420f1000]
>>>> Linux version 5.11.22-g0453a426c37b (rmilecki at localhost.localdomain) (aarch64-buildroot-linux-uclibc-gcc.br_real (Buildroot -g91617ed) 9.3.0, GNU ld (GNU Binutils) 2.33.1) #8 SMP Thu Aug 12 09:25:55 CEST 2021
>>>> Machine model: Asus GT-AC5300
>>>> earlycon: bcm63xx_uart0 at MMIO 0x00000000ff800640 (options '')
>>>> printk: bootconsole [bcm63xx_uart0] enabled
>>>> efi: UEFI not found.
>>>> [Firmware Bug]: Kernel image misaligned at boot, please fix your bootloader!
>>>> Zone ranges:
>>>>     DMA      [mem 0x0000000000000000-0x000000003fffffff]
>>>>     DMA32    empty
>>>>     Normal   empty
>>>> Movable zone start for each node
>>>> Early memory node ranges
>>>>     node   0: [mem 0x0000000000000000-0x000000003fffffff]
>>>> Initmem setup node 0 [mem 0x0000000000000000-0x000000003fffffff]
>>>> percpu: Embedded 18 pages/cpu s43904 r0 d29824 u73728
>>>> Detected VIPT I-cache on CPU0
>>>> CPU features: detected: ARM erratum 843419
>>>> Built 1 zonelists, mobility grouping on.  Total pages: 258048
>>>> Kernel command line: earlycon=bcm63xx_uart,0xff800640
>>>> Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
>>>> Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
>>>> mem auto-init: stack:off, heap alloc:off, heap free:off
>>>> Memory: 1019556K/1048576K available (4352K kernel code, 678K rwdata, 860K rodata, 2496K init, 232K bss, 29020K reserved, 0K cma-reserved)
>>>> SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
>>>> rcu: Hierarchical RCU implementation.
>>>> rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
>>>> NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
>>>> GIC: Using split EOI/Deactivate mode
>>>> random: get_random_bytes called from start_kernel+0x33c/0x52c with crng_init=0
>>>> arch_timer: cp15 timer(s) running at 50.00MHz (phys).
>>>> clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xb8812736b, max_idle_ns: 440795202655 ns
>>>> sched_clock: 56 bits at 50MHz, resolution 20ns, wraps every 4398046511100ns
>>>> Console: colour dummy device 80x25
>>>> printk: console [tty0] enabled
>>>> printk: bootconsole [bcm63xx_uart0] disabled
>>>>
>>>>
>>>> (Unless it's a false conclusion and CONFIG_KVM just breaks console
>>>> somehow)
>>>
>>> No, that's because you don't pass the right console to your
>>> kernel. Add something like "console=ttyS0,115200" to the kernel
>>> command line, which will show what you are missing, as well as stop
>>> the double-logging.
>>>
>>> Anyway, the fact that it stops booting when you enable KVM confirms my
>>> suspicion. The firmware on this system is probably crap enough not to
>>> enable HVC. Let's confirm it further: please apply the patch below on
>>> top of mainline and tell me that it now boots fine...
>>
>> Thanks for the patch! It works around the issue. See below.
>>
>>
>>> Are you in a position where you can actually fix the firmware? Or is
>>> it some closed-source blob?
>>
>> I'm just an end-user with no access to CFE sources and without any
>> business contact at Broadcom :(
> 
> I feared that would be the case. Florian's reply seems to indicate
> that the "upstream" firmware implementation is correct, so the OEM
> must have fumbled it somehow...

Please note that Broadcom has many business units and many teams, and
from my understanding they often don't cooperate properly.

It's likely that the BCM4908 BU screwed something up. Or maybe it's a
matter of CFE vs. U-Boot?

Florian: does your team (set-top box and cable modem devices) use CFE or
U-Boot with kernels 5.12+?

It's very unlikely it's a single OEM that broke CFE with custom
modifications. This problem affects all 3 devices I own:
1. Netgear R8000P
2. TP-Link Archer C2300 V1
3. Asus GT-AC5300
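
For reference, Marc's suspicion quoted above is about SCR_EL3.HCE. As far
as I understand the Arm ARM, that is bit 8 of SCR_EL3, and with it cleared
the HVC instruction is UNDEFINED. I have no way to read SCR_EL3 on these
devices, so the following is only a sketch of how a raw register value
could be decoded if someone with firmware access can dump one. The helper
name and the sample values are made up for illustration.

```python
# Sketch only: the bit position (8) is taken from the Arm ARM description
# of SCR_EL3.HCE; the helper name and sample values are hypothetical.
SCR_EL3_HCE = 1 << 8


def hvc_enabled(scr_el3):
    """Return True if the HCE bit is set in a raw SCR_EL3 value."""
    return bool(scr_el3 & SCR_EL3_HCE)


# Hypothetical register dumps, not values read from any real device.
print(hvc_enabled(0x501))  # HCE set
print(hvc_enabled(0x401))  # HCE clear
```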


>>> diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
>>> index 43d212618834..fc95b103ef42 100644
>>> --- a/arch/arm64/kernel/hyp-stub.S
>>> +++ b/arch/arm64/kernel/hyp-stub.S
>>> @@ -238,7 +238,7 @@ SYM_FUNC_START(switch_to_vhe)
>>>      	// Turn the world upside down
>>>    	mov	x0, #HVC_VHE_RESTART
>>> -	hvc	#0
>>> +//	hvc	#0
>>>    1:
>>>    	ret
>>>    SYM_FUNC_END(switch_to_vhe)
>>
>> This allows me to boot 5.13.9 and 5.14-rc5 without any reverts!
>>
>> Enabling CONFIG_KVM still results in the:
>> Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt
> 
> That's expected. Can you please check the patch below? It should
> result in a booting kernel which actually survives having KVM compiled
> in. It should even display a warning telling you that your setup is
> completely buggered.
> 
> That's obviously not the final version, but probably a good enough
> approximation.

It seems to work! Kernel has booted and I saw:
CPU: CPUs started in inconsistent modes
WARNING: CPU: 0 PID: 1 at arch/arm64/kernel/smp.c:426 smp_cpus_done+0x8c/0xc8
(...)
kvm [1]: HYP mode not available
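
In case someone wants to check a captured boot log from another BCM4908
device for the same two symptoms, a minimal sketch could look like this.
The marker strings are copied from the log in this mail; the function
name and the return format are my own invention, and the exact wording of
the messages may differ with a later version of the patch.

```python
# Sketch only: marker strings are copied from the boot log in this mail;
# the function name and return format are made up for illustration.
MARKERS = (
    "CPUs started in inconsistent modes",
    "HYP mode not available",
)


def find_markers(log_text):
    """Return the markers present in log_text, in the order of MARKERS."""
    return [marker for marker in MARKERS if marker in log_text]
```

Feeding it the saved console output (for example the contents of a
minicom capture file) should return both strings on an affected device.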


Starting program at 0x0000000000080000
/memory = 0x40000000
WARNING: Node's property /reserved-memory/dt_reserved_buffer is not defined
WARNING: Node's property /reserved-memory/dt_reserved_flow is not defined
WARNING: Node's property /reserved-memory/dt_reserved_dhd2 is not defined
Booting Linux on physical CPU 0x0000000000 [0x420f1000]
Linux version 5.14.0-rc5-g8bad1731c752-dirty (rmilecki at localhost.localdomain) (aarch64-buildroot-linux-uclibc-gcc.br_real (Buildroot -g91617ed) 9.3.0, GNU ld (GNU Binutils) 2.33.1) #23 SMP Thu Aug 12 14:20:52 CEST 2021
Machine model: Asus GT-AC5300
earlycon: bcm63xx_uart0 at MMIO 0x00000000ff800640 (options '')
printk: bootconsole [bcm63xx_uart0] enabled
efi: UEFI not found.
[Firmware Bug]: Kernel image misaligned at boot, please fix your bootloader!
Zone ranges:
   DMA      [mem 0x0000000000000000-0x000000003fffffff]
   DMA32    empty
   Normal   empty
Movable zone start for each node
Early memory node ranges
   node   0: [mem 0x0000000000000000-0x000000003fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000003fffffff]
percpu: Embedded 18 pages/cpu s44568 r0 d29160 u73728
Detected VIPT I-cache on CPU0
CPU features: detected: ARM erratum 843419
Built 1 zonelists, mobility grouping on.  Total pages: 258048
Kernel command line: earlycon=bcm63xx_uart,0xff800640 console=ttyS0,115200
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 1017496K/1048576K available (5824K kernel code, 762K rwdata, 1100K rodata, 2624K init, 259K bss, 31080K reserved, 0K cma-reserved)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
rcu: Hierarchical RCU implementation.
rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
Root IRQ handler: gic_handle_irq
random: get_random_bytes called from start_kernel+0x4a0/0x6dc with crng_init=0
arch_timer: cp15 timer(s) running at 50.00MHz (virt).
clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xb8812736b, max_idle_ns: 440795202655 ns
sched_clock: 56 bits at 50MHz, resolution 20ns, wraps every 4398046511100ns
Console: colour dummy device 80x25
Calibrating delay loop (skipped), value calculated using timer frequency.. 100.00 BogoMIPS (lpj=200000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
rcu: Hierarchical SRCU implementation.
EFI services will not be available.
smp: Bringing up secondary CPUs ...
Detected VIPT I-cache on CPU1
CPU1: Booted secondary processor 0x0000000001 [0x420f1000]
Detected VIPT I-cache on CPU2
CPU2: Booted secondary processor 0x0000000002 [0x420f1000]
Detected VIPT I-cache on CPU3
CPU3: Booted secondary processor 0x0000000003 [0x420f1000]
smp: Brought up 1 node, 4 CPUs
SMP: Total of 4 processors activated.
CPU features: detected: 32-bit EL0 Support
CPU features: detected: 32-bit EL1 Support
CPU features: detected: CRC32 instructions
------------[ cut here ]------------
CPU: CPUs started in inconsistent modes
WARNING: CPU: 0 PID: 1 at arch/arm64/kernel/smp.c:426 smp_cpus_done+0x8c/0xc8
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc5-g8bad1731c752-dirty #23
Hardware name: Asus GT-AC5300 (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
pc : smp_cpus_done+0x8c/0xc8
lr : smp_cpus_done+0x8c/0xc8
sp : ffffffc01002be00
x29: ffffffc01002be00 x28: 0000000000000000 x27: 0000000000000000
x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
x23: ffffffc010ab4000 x22: 0000000000000000 x21: 0000000000000000
x20: ffffffc0107b7e74 x19: ffffffc010a78000 x18: 0000000000000001
x17: ffffffc010a9ee40 x16: 0000000000000000 x15: 0000424c5a953180
x14: fffffffffffc0ef7 x13: 0000000000000037 x12: ffffff80010b03b0
x11: 00000000ffffffea x10: ffffffc010a5eb50 x9 : 0000000000000001
x8 : 0000000000000001 x7 : 0000000000017fe8 x6 : c0000000ffffefff
x5 : 0000000000057fa8 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 00000000ffffffff x1 : 33e2e90440df2000 x0 : 0000000000000000
Call trace:
  smp_cpus_done+0x8c/0xc8
  smp_init+0x68/0x78
  kernel_init_freeable+0xd0/0x214
  kernel_init+0x24/0x120
  ret_from_fork+0x10/0x18
---[ end trace 773cbee471955c5a ]---
alternatives: patching kernel code
devtmpfs: initialized
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
pinctrl core: initialized pinctrl subsystem
DMI not present or invalid.
NET: Registered PF_NETLINK/PF_ROUTE protocol family
DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
thermal_sys: Registered thermal governor 'step_wise'
ASID allocator initialised with 65536 entries
iommu: Default domain type: Translated
vgaarb: loaded
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
clocksource: Switched to clocksource arch_sys_counter
NET: Registered PF_INET protocol family
IP idents hash table entries: 16384 (order: 5, 131072 bytes, linear)
tcp_listen_portaddr_hash hash table entries: 512 (order: 1, 8192 bytes, linear)
TCP established hash table entries: 8192 (order: 4, 65536 bytes, linear)
TCP bind hash table entries: 8192 (order: 5, 131072 bytes, linear)
TCP: Hash tables configured (established 8192 bind 8192)
UDP hash table entries: 512 (order: 2, 16384 bytes, linear)
UDP-Lite hash table entries: 512 (order: 2, 16384 bytes, linear)
NET: Registered PF_UNIX/PF_LOCAL protocol family
PCI: CLS 0 bytes, default 64
kvm [1]: HYP mode not available
workingset: timestamp_bits=62 max_order=18 bucket_order=0
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler mq-deadline registered
io scheduler kyber registered
ɥ饹�ѭ� console [ttyS0] enabled 0xff800640 (irq = 24, base_baud = 1562500) is a bcm63xx_uart
printk: console [ttyS0] enabled
printk: bootconsole [bcm63xx_uart0] disabled
printk: bootconsole [bcm63xx_uart0] disabled
nand: device found, Manufacturer ID: 0xc8, Chip ID: 0xda
nand: ESMT NAND 256MiB 3,3V 8-bit
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
bcm63138_nand ff801800.nand: detected 256MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
Bad block table found at page 131008, version 0x01
Bad block table found at page 130944, version 0x01
3 fixed-partitions partitions found on MTD device brcmnand.0
Creating 3 MTD partitions on "brcmnand.0":
0x000000000000-0x000000100000 : "cferom"
0x000000100000-0x000005800000 : "firmware"
0x000005800000-0x00000af00000 : "backup"
libphy: Fixed MDIO Bus: probed
libphy: unimac MII bus: probed
unimac-mdio 800c05c0.mdio: Broadcom UniMAC MDIO bus
libphy: sf2 slave mii: probed
brcm-sf2 80080000.ethernet-switch: found switch: BCM4908, rev 0
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
ehci-platform: EHCI generic platform driver
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci-pci: OHCI PCI platform driver
ohci-platform: OHCI generic platform driver
i2c /dev entries driver
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
NET: Registered PF_INET6 protocol family
Segment Routing with IPv6
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
NET: Registered PF_PACKET protocol family
8021q: 802.1Q VLAN Support v1.8
brcmstb-usb-phy 8000c200.usb-phy: Clock not found in Device Tree
brcmstb-usb-phy 8000c200.usb-phy: USB3.0 clock not found in Device Tree
brcmstb-usb-phy 8000c200.usb-phy: Suspend Clock not found in Device Tree
brcmstb-usb-phy 8000c200.usb-phy: IRQ wake not found
brcmstb-usb-phy 8000c200.usb-phy: IRQ wakeup not found
brcmstb-usb-phy 8000c200.usb-phy: Wake interrupt missing, system wake not supported
libphy: sf2 slave mii: probed
brcm-sf2 80080000.ethernet-switch: found switch: BCM4908, rev 0
brcm-sf2 80080000.ethernet-switch lan2 (uninitialized): PHY [800c05c0.mdio--1:08] driver [Generic PHY] (irq=POLL)
brcm-sf2 80080000.ethernet-switch lan1 (uninitialized): PHY [800c05c0.mdio--1:09] driver [Generic PHY] (irq=POLL)
brcm-sf2 80080000.ethernet-switch lan6 (uninitialized): PHY [800c05c0.mdio--1:0a] driver [Generic PHY] (irq=POLL)
brcm-sf2 80080000.ethernet-switch lan5 (uninitialized): PHY [800c05c0.mdio--1:0b] driver [Generic PHY] (irq=POLL)
brcm-sf2 80080000.ethernet-switch: configuring for fixed/internal link mode
eth0: mtu greater than device maximum
bcm4908_enet 80002000.ethernet eth0: error -22 setting MTU to 1504 to include DSA overhead
DSA: tree 0 setup
brcm-sf2 80080000.ethernet-switch: Starfighter 2 top: 4.07, core: 5.00, IRQs: 22, 23
brcm-sf2 80080000.ethernet-switch: Link is Up - 1Gbps/Full - flow control off
ehci-platform 8000c300.usb: EHCI Host Controller
ehci-platform 8000c300.usb: new USB bus registered, assigned bus number 1
ehci-platform 8000c300.usb: irq 19, io mem 0x8000c300
ehci-platform 8000c300.usb: USB 2.0 started, EHCI 1.00
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ohci-platform 8000c400.usb: Generic Platform OHCI controller
ohci-platform 8000c400.usb: new USB bus registered, assigned bus number 2
ohci-platform 8000c400.usb: irq 20, io mem 0x8000c400
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
xhci-hcd 8000d000.usb: xHCI Host Controller
xhci-hcd 8000d000.usb: new USB bus registered, assigned bus number 3
xhci-hcd 8000d000.usb: hcc params 0x0250f17c hci version 0x100 quirks 0x0000000000010010
xhci-hcd 8000d000.usb: irq 21, io mem 0x8000d000
hub 3-0:1.0: USB hub found
hub 3-0:1.0: config failed, hub doesn't have any ports! (err -19)
xhci-hcd 8000d000.usb: xHCI Host Controller
xhci-hcd 8000d000.usb: new USB bus registered, assigned bus number 4
xhci-hcd 8000d000.usb: Host supports USB 3.0 SuperSpeed
usb usb4: We don't know the algorithms for LPM for this host, disabling LPM.
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
Freeing unused kernel memory: 2624K
Run /init as init process
tmpfs: Unknown parameter 'mode'
mount: mounting tmpfs: Unknown parameter 'mode'
tmpfs on /dev/shtmpfs: Unknown parameter 'mode'
m failed: Invalid argument
mount: mounting tmpfs on /tmp failed: Invalid argument
mount: mounting tmpfs on /run failed: Invalid argument
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Savrandom: dd: uninitialized urandom read (512 bytes read)
ing random seed: OK
Starting network: brcm-sf2 80080000.ethernet-switch lan1: configuring for phy/internal link mode
8021q: adding VLAN 0 to HW filter on device lan1
brcm-sf2 80080000.ethernet-switch lan2: configuring for phy/internal link mode
8021q: adding VLAN 0 to HW filter on device lan2
br-lan: port 1(lan1) entered blocking state
br-lan: port 1(lan1) entered disabled state
device lan1 entered promiscuous mode
device eth0 entered promiscuous mode
br-lan: port 2(lan2) entered blocking state
br-lan: port 2(lan2) entered disabled state
device lan2 entered promiscuous mode
OK

Welcome to Buildroot
buildroot login: brcm-sf2 80080000.ethernet-switch lan1: Link is Up - 1Gbps/Full - flow control rx/tx
IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready
br-lan: port 1(lan1) entered blocking state
br-lan: port 1(lan1) entered forwarding state
IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready

Welcome to Buildroot
buildroot login:


