arm64 regression in kernel 5.12 related to the (n)VHE

Rafał Miłecki zajec5 at gmail.com
Wed Aug 11 09:55:07 PDT 2021


On 11.08.2021 14:50, Marc Zyngier wrote:
> On Wed, 11 Aug 2021 13:15:31 +0100,
> Rafał Miłecki <zajec5 at gmail.com> wrote:
>>
>> Hi,
>>
>> I just tried upgrading from the good old LTS kernel 5.10 and
>> discovered that my bcm4908 boards no longer boot with 5.14-rc5.
>>
>>
>> The problem is that the kernel doesn't seem to start booting at all.
>> I see the CFE bootloader messages:
>>
>> Starting program at 0x0000000000080000
>> /memory = 0x40000000
>>
>> and then nothing. Normally the first kernel line should follow, like:
>> Linux version 5.11.0-rc4 (rmilecki at localhost.localdomain) (aarch64-buildroot-linux-uclibc-gcc.br_real (Buildroot -g91617ed) 9.3.0, GNU ld (GNU Binutils) 2.33.1) #30 SMP Wed Aug 11 14:01:00 CEST 2021
>>
>>
>> I have zero knowledge of low-level arm64 or assembler stuff. I also
>> don't own any bcm4908 development board or bcm4908 datasheets.
>>
>> All I could do to help debug this regression was bisect it. The
>> first bad commit (I verified it after the bisecting process) is:
>>
>> commit 0c93df9622d4d921bcd0dc83f71fed9e98f5119f
>> Author: Marc Zyngier <maz at kernel.org>
>> Date:   Mon Feb 8 09:57:14 2021 +0000
>>
>>      arm64: Initialise as nVHE before switching to VHE
>>
>>      As we are aiming to be able to control whether we enable VHE or
>>      not, let's always drop down to EL1 first, and only then upgrade
>>      to VHE if at all possible.
>>
>>      This means that if the kernel is booted at EL2, we always start
>>      with a nVHE init, drop to EL1 to initialise the kernel, and
>>      only then upgrade the kernel EL to EL2 if possible (the process
>>      is obviously shortened for secondary CPUs).
>>
>>      The resume path is handled similarly to a secondary CPU boot.
>>
>>      Signed-off-by: Marc Zyngier <maz at kernel.org>
>>      Acked-by: David Brazdil <dbrazdil at google.com>
>>      Acked-by: Catalin Marinas <catalin.marinas at arm.com>
>>      Link: https://lore.kernel.org/r/20210208095732.3267263-6-maz@kernel.org
>>      [will: Avoid calling switch_to_vhe twice on kaslr path]
>>      Signed-off-by: Will Deacon <will at kernel.org>
>>
>>
>> Could you look at this issue, please? I'm happy to test any patches or
>> provide any extra info I can obtain using kernel 5.11.
>>
>>
>> My defconfig for bcm4908 is:
> 
> [...]
> 
> I don't think the defconfig is that relevant (nothing you quote here
> would have an impact that early in the boot process).
> 
> On the other hand, a description of the platform (what CPUs it
> has) and how it boots (VHE, non-VHE, booted at EL2 or not) would be
> extremely useful. At minimum, a boot log of a working kernel could
> help.

Thank you for your patience & reply.

BCM4908 is Broadcom's 64-bit platform with Broadcom's own Brahma-B53
CPU(s). I don't know how it boots. Is that something I can find out
from a running system?

For the DTS SoC description, you can check:
arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi

See below for the boot log and /proc/cpuinfo. Please note that I seem to
have the console misconfigured, so the early part of the log appears
twice (nothing really harmful).

Starting program at 0x0000000000080000
/memory = 0x40000000
WARNING: Node's property /reserved-memory/dt_reserved_buffer is not defined
WARNING: Node's property /reserved-memory/dt_reserved_flow is not defined
WARNING: Node's property /reserved-memory/dt_reserved_dhd2 is not defined
Booting Linux on physical CPU 0x0000000000 [0x420f1000]
Linux version 5.11.22-g40462c7f0649 (rmilecki at localhost.localdomain) (aarch64-buildroot-linux-uclibc-gcc.br_real (Buildroot -g91617ed) 9.3.0, GNU ld (GNU Binutils) 2.33.1) #9 SMP Wed Aug 11 18:39:58 CEST 2021
Machine model: Asus GT-AC5300
earlycon: bcm63xx_uart0 at MMIO 0x00000000ff800640 (options '')
printk: bootconsole [bcm63xx_uart0] enabled
efi: UEFI not found.
[Firmware Bug]: Kernel image misaligned at boot, please fix your bootloader!
Zone ranges:
   DMA      [mem 0x0000000000000000-0x000000003fffffff]
   DMA32    empty
   Normal   empty
Movable zone start for each node
Early memory node ranges
   node   0: [mem 0x0000000000000000-0x000000003fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000003fffffff]
percpu: Embedded 17 pages/cpu s37856 r0 d31776 u69632
Detected VIPT I-cache on CPU0
CPU features: detected: ARM erratum 843419
Built 1 zonelists, mobility grouping on.  Total pages: 258048
Kernel command line: earlycon=bcm63xx_uart,0xff800640
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 1020660K/1048576K available (3584K kernel code, 650K rwdata, 684K rodata, 2368K init, 229K bss, 27916K reserved, 0K cma-reserved)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
rcu: Hierarchical RCU implementation.
rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
GIC: Using split EOI/Deactivate mode
random: get_random_bytes called from start_kernel+0x33c/0x524 with crng_init=0
arch_timer: cp15 timer(s) running at 50.00MHz (phys).
clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xb8812736b, max_idle_ns: 440795202655 ns
sched_clock: 56 bits at 50MHz, resolution 20ns, wraps every 4398046511100ns
Console: colour dummy device 80x25
printk: console [tty0] enabled
printk: bootconsole [bcm63xx_uart0] disabled
Booting Linux on physical CPU 0x0000000000 [0x420f1000]
Linux version 5.11.22-g40462c7f0649 (rmilecki at localhost.localdomain) (aarch64-buildroot-linux-uclibc-gcc.br_real (Buildroot -g91617ed) 9.3.0, GNU ld (GNU Binutils) 2.33.1) #9 SMP Wed Aug 11 18:39:58 CEST 2021
Machine model: Asus GT-AC5300
earlycon: bcm63xx_uart0 at MMIO 0x00000000ff800640 (options '')
printk: bootconsole [bcm63xx_uart0] enabled
efi: UEFI not found.
[Firmware Bug]: Kernel image misaligned at boot, please fix your bootloader!
Zone ranges:
   DMA      [mem 0x0000000000000000-0x000000003fffffff]
   DMA32    empty
   Normal   empty
Movable zone start for each node
Early memory node ranges
   node   0: [mem 0x0000000000000000-0x000000003fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000003fffffff]
percpu: Embedded 17 pages/cpu s37856 r0 d31776 u69632
Detected VIPT I-cache on CPU0
CPU features: detected: ARM erratum 843419
Built 1 zonelists, mobility grouping on.  Total pages: 258048
Kernel command line: earlycon=bcm63xx_uart,0xff800640
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 1020660K/1048576K available (3584K kernel code, 650K rwdata, 684K rodata, 2368K init, 229K bss, 27916K reserved, 0K cma-reserved)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
rcu: Hierarchical RCU implementation.
rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
GIC: Using split EOI/Deactivate mode
random: get_random_bytes called from start_kernel+0x33c/0x524 with crng_init=0
arch_timer: cp15 timer(s) running at 50.00MHz (phys).
clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xb8812736b, max_idle_ns: 440795202655 ns
sched_clock: 56 bits at 50MHz, resolution 20ns, wraps every 4398046511100ns
Console: colour dummy device 80x25
printk: console [tty0] enabled
printk: bootconsole [bcm63xx_uart0] disabled
Calibrating delay loop (skipped), value calculated using timer frequency.. 100.00 BogoMIPS (lpj=200000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
rcu: Hierarchical SRCU implementation.
EFI services will not be available.
smp: Bringing up secondary CPUs ...
Detected VIPT I-cache on CPU1
CPU1: Booted secondary processor 0x0000000001 [0x420f1000]
Detected VIPT I-cache on CPU2
CPU2: Booted secondary processor 0x0000000002 [0x420f1000]
Detected VIPT I-cache on CPU3
CPU3: Booted secondary processor 0x0000000003 [0x420f1000]
smp: Brought up 1 node, 4 CPUs
SMP: Total of 4 processors activated.
CPU features: detected: 32-bit EL0 Support
CPU features: detected: CRC32 instructions
CPU: All CPU(s) started at EL2
alternatives: patching kernel code
devtmpfs: initialized
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
pinctrl core: initialized pinctrl subsystem
DMI not present or invalid.
DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
thermal_sys: Registered thermal governor 'step_wise'
ASID allocator initialised with 65536 entries
iommu: Default domain type: Translated
vgaarb: loaded
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
clocksource: Switched to clocksource arch_sys_counter
PCI: CLS 0 bytes, default 64
workingset: timestamp_bits=62 max_order=18 bucket_order=0
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler mq-deadline registered
io scheduler kyber registered
basic-mmio-gpio: probe of ff800500.gpio-controller failed with error -22
ff800640.serial: ttyS0 at MMIO 0xff800640 (irq = 17, base_baud = 1562500) is a bcm63xx_uart
printk: console [ttyS0] enabled
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
ehci-platform: EHCI generic platform driver
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci-pci: OHCI PCI platform driver
ohci-platform: OHCI generic platform driver
i2c /dev entries driver
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
Freeing unused kernel memory: 2368K
Run /init as init process
tmpfs: Unknown parameter 'mode'
tmpfs: Unknown parameter 'mode'
tmpfs: Unknown parameter 'mode'
mount: mounting tmpfs on /dev/shm failed: Invalid argument
mount: mounting tmpfs on /tmp failed: Invalid argument
mount: mounting tmpfs on /run failed: Invalid argument
Starting syslogd: OK
Starting klogd: random: dd: uninitialized urandom read (512 bytes read)
OK
Running sysctl: OK
Saving random seed: OK
Starting network: ip: socket: Function not implemented
ip: socket: Function not implemented
ip: socket: Function not implemented
ip: socket: Function not implemented
ip: socket: Function not implemented
ip: socket: Function not implemented
ip: socket: Function not implemented
ip: socket: Function not implemented
ip: socket: Function not implemented
FAIL

Welcome to Buildroot
buildroot login:
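
On the question of finding out how the system boots: the log above contains
the line "CPU: All CPU(s) started at EL2", which is the kernel's own summary
of the exception level the CPUs entered at. A minimal sketch of pulling that
line out of a saved log follows. The function name and the sample text are
made up for illustration, and it only answers the EL1/EL2 question, not
whether VHE is in use.

```python
import re

# Matches the kernel's boot-mode summary line as it appears in the log above.
BOOT_EL_RE = re.compile(r"CPU: All CPU\(s\) started at (EL[12])")


def boot_exception_level(log_text):
    """Return 'EL1' or 'EL2' if the summary line is present, else None.

    Hypothetical helper: it only looks for the summary line and does not
    handle the case where CPUs started in inconsistent modes.
    """
    match = BOOT_EL_RE.search(log_text)
    return match.group(1) if match else None


# Three lines copied from the boot log in this message.
sample = """\
CPU features: detected: CRC32 instructions
CPU: All CPU(s) started at EL2
alternatives: patching kernel code
"""

print(boot_exception_level(sample))
```

On a running system the same function could be fed the output of dmesg
instead of the sample string.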

# cat /proc/cpuinfo
processor       : 0
BogoMIPS        : 100.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x42
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x100
CPU revision    : 0

processor       : 1
BogoMIPS        : 100.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x42
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x100
CPU revision    : 0

random: fast init done
processor       : 2
BogoMIPS        : 100.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x42
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x100
CPU revision    : 0

processor       : 3
BogoMIPS        : 100.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x42
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x100
CPU revision    : 0
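
The value in brackets on the "Booting Linux on physical CPU" line
(0x420f1000) is the CPU's MIDR_EL1 register, and the implementer, variant,
part and revision fields of /proc/cpuinfo are decoded from it. As a rough
cross-check, the sketch below splits that value using the architectural
MIDR_EL1 bit layout. The function name is hypothetical. Note that the
"CPU architecture: 8" line in /proc/cpuinfo is not the raw architecture
field of the register, which reads 0xf here.

```python
def decode_midr(midr):
    """Split a 32-bit MIDR_EL1 value into its architectural fields."""
    return {
        "implementer": (midr >> 24) & 0xFF,   # bits [31:24]
        "variant": (midr >> 20) & 0xF,        # bits [23:20]
        "architecture": (midr >> 16) & 0xF,   # bits [19:16]
        "part": (midr >> 4) & 0xFFF,          # bits [15:4]
        "revision": midr & 0xF,               # bits [3:0]
    }


# MIDR value taken from the boot log in this message.
fields = decode_midr(0x420F1000)
for name, value in fields.items():
    print(f"{name:12s}: {value:#x}")
```

The decoded implementer (0x42), variant (0x0), part (0x100) and revision (0)
agree with the /proc/cpuinfo output above.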


