arm64 regression in kernel 5.12 related to the (n)VHE

Marc Zyngier maz at kernel.org
Thu Aug 12 05:57:46 PDT 2021


On Thu, 12 Aug 2021 13:29:56 +0100,
Rafał Miłecki <zajec5 at gmail.com> wrote:
> 
> On 12.08.2021 12:13, Marc Zyngier wrote:
> > On Thu, 12 Aug 2021 09:24:14 +0100,
> > Rafał Miłecki <zajec5 at gmail.com> wrote:

[...]

> >> I'm just an end-user with no access to CFE sources and without any
> >> business contact at Broadcom :(
> > 
> > I feared that would be the case. Florian's reply seems to indicate
> > that the "upstream" firmware implementation is correct, so the OEM
> > must have fumbled it somehow...
> 
> Please note that Broadcom has many business units, many teams and from
> my understanding they often don't cooperate properly.

I bet some team sampled an early version of the firmware that included
the bug and never looked back. You can also tell the level of quality
by the fact that it uses spin-tables to boot, that the interrupt
controller node is incomplete...

> It's likely that BCM4908 BU screwed something up. Or maybe it's a matter
> of CFE vs. U-Boot?

It is a matter of whatever is running at EL3 and doing the basic setup
of the CPUs.

> 
> Florian: does your team (set-top box and cable modem devices) use CFE or
> U-Boot with kernels 5.12+?
> 
> It's very unlikely it's a single OEM that broke CFE with custom
> modifications. This problem affects all 3 devices I own:
> 1. Netgear R8000P
> 2. TP-Link Archer C2300 V1
> 3. Asus GT-AC5300

They probably all use the same pre-cast design with some sort of
value-add on top.

[...]

> > That's expected. Can you please check the patch below? It should
> > result in a booting kernel which actually survives having KVM compiled
> > in. It should even display a warning telling you that your setup is
> > completely buggered.
> > 
> > That's obviously not the final version, but probably a good enough
> > approximation.
> 
> It seems to work! Kernel has booted and I saw:
> CPU: CPUs started in inconsistent modes
> WARNING: CPU: 0 PID: 1 at arch/arm64/kernel/smp.c:426 smp_cpus_done+0x8c/0xc8
> (...)
> kvm [1]: HYP mode not available

Right. So there is some hope. Maybe. I'm not sure I want to maintain
this crap though.
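
For reference, the two messages in the quoted log fit together: if the
CPUs do not all report the same boot exception level, the kernel warns
about the mismatch and treats HYP mode as unavailable. Below is a
minimal sketch of that bookkeeping, modelled loosely on the helpers in
arch/arm64/include/asm/virt.h as I understand them. The constant values
and the two-slot layout are assumptions made for illustration, and
record_boot_modes() is a hypothetical helper, not a kernel function.

```python
# Simplified model of the arm64 boot-mode bookkeeping.
# Assumption: the constants and the two-slot layout mirror the kernel's
# __boot_cpu_mode[] array; treat them as illustrative, not authoritative.
BOOT_CPU_MODE_EL1 = 0xE11
BOOT_CPU_MODE_EL2 = 0xE12


def record_boot_modes(cpu_modes):
    """Hypothetical helper: fold each CPU's entry mode into two slots.

    Slot 0 starts as EL2 and slot 1 as EL1. A CPU entering at EL1
    overwrites slot 0, a CPU entering at EL2 overwrites slot 1.
    """
    slots = [BOOT_CPU_MODE_EL2, BOOT_CPU_MODE_EL1]
    for mode in cpu_modes:
        if mode == BOOT_CPU_MODE_EL1:
            slots[0] = BOOT_CPU_MODE_EL1
        else:
            slots[1] = BOOT_CPU_MODE_EL2
    return slots


def hyp_mode_available(slots):
    # HYP is usable only if every CPU reported EL2.
    return slots[0] == BOOT_CPU_MODE_EL2 and slots[1] == BOOT_CPU_MODE_EL2


def hyp_mode_mismatched(slots):
    # The two slots differ when CPUs disagree about their entry mode.
    return slots[0] != slots[1]


if __name__ == "__main__":
    mixed = record_boot_modes(
        [BOOT_CPU_MODE_EL2, BOOT_CPU_MODE_EL1, BOOT_CPU_MODE_EL1, BOOT_CPU_MODE_EL1]
    )
    print("mismatched:", hyp_mode_mismatched(mixed))
    print("HYP available:", hyp_mode_available(mixed))
```

With a mixed set of entry modes the model reports a mismatch and no HYP
mode, which is the combination seen in the log above.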

[...]

> nand: device found, Manufacturer ID: 0xc8, Chip ID: 0xda
> nand: ESMT NAND 256MiB 3,3V 8-bit
> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> bcm63138_nand ff801800.nand: detected 256MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> Bad block table found at page 131008, version 0x01
> Bad block table found at page 130944, version 0x01
> 3 fixed-partitions partitions found on MTD device brcmnand.0
> Creating 3 MTD partitions on "brcmnand.0":
> 0x000000000000-0x000000100000 : "cferom"
> 0x000000100000-0x000005800000 : "firmware"
> 0x000005800000-0x00000af00000 : "backup"

So here's your chance! You have the firmware image here (I guess
"cferom" is the one). It'd be interesting to disassemble it, find out
where SCR_EL3 is set, patch it and never look back.

Only kidding.
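
For anyone curious what "where SCR_EL3 is set" would mean in practice,
here is a minimal sketch that decodes a few SCR_EL3 control bits from a
raw register value. The bit positions follow the Arm architecture
reference manual as I remember it. The sample value is made up, and HCE
is picked purely as an example: nothing in this thread establishes
which bit the firmware actually gets wrong.

```python
# Decode a handful of SCR_EL3 control bits from a raw register value.
# Assumption: bit positions are taken from the Armv8-A architecture
# manual; only a subset of fields is listed here.
SCR_EL3_BITS = {
    "NS": 0,    # lower exception levels are Non-secure
    "IRQ": 1,   # route physical IRQs to EL3
    "FIQ": 2,   # route physical FIQs to EL3
    "EA": 3,    # route external aborts and SErrors to EL3
    "SMD": 7,   # disable the SMC instruction
    "HCE": 8,   # enable the HVC instruction
    "SIF": 9,   # secure instruction fetch restriction
    "RW": 10,   # next lower exception level is AArch64
}


def decode_scr_el3(value):
    """Return a dict mapping each known field name to True/False."""
    return {name: bool((value >> bit) & 1) for name, bit in SCR_EL3_BITS.items()}


def set_field(value, name):
    """Return the register value with the named bit set."""
    return value | (1 << SCR_EL3_BITS[name])


if __name__ == "__main__":
    sample = 0x431  # hypothetical value, not read from any real firmware
    print(decode_scr_el3(sample))
    print(hex(set_field(sample, "HCE")))
```

The helper only manipulates an integer. Locating the instruction that
writes the register inside a firmware image is a separate
reverse-engineering job that this sketch does not attempt.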

	M.

-- 
Without deviation from the norm, progress is not possible.