[BUG] ARM64 regression: NULL pointer dereference in arm_smccc_version_init+0x90/0x1ac

Mark Rutland mark.rutland at arm.com
Thu Jan 30 04:19:55 PST 2025


On Fri, Jan 24, 2025 at 03:52:10PM +0100, Emanuele Rocca wrote:
> Hi,

Hi Emanuele,

> longterm kernel 6.1.123 crashes early when booting on the Lenovo Thinkpad X13s
> with the following error:
> 
>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000264
> 
>  pc: arm_smccc_version_init+0x90/0x1ac
> 
> According to faddr2line, that is line 31 of smccc.c:
> 
>  arm_smccc_version_init+0x90/0x1ac:
>  arm_smccc_version_init at debian/build/build_arm64_none_arm64/drivers/firmware/smccc/smccc.c:31
> 
>  22 void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit)
>  23 {
>  24         struct arm_smccc_res res;
>  25 
>  26         smccc_version = version;
>  27         smccc_conduit = conduit;
>  28 
>  29         smccc_trng_available = smccc_probe_trng();
>  30 
>  31         if ((smccc_version >= ARM_SMCCC_VERSION_1_2) &&

For the benefit of others: when we looked into this a few days ago, it
appeared that a GPR was being clobbered across an SMCCC call, resulting
in a later crash (as that GPR should hold the ADRP'd base address of
'smccc_version'). I didn't have the time to dig further into that (e.g.
to figure out whether the kernel, compiler, or firmware was to blame).

Emanuele, could you please dump the result of:

  objdump --disassemble=arm_smccc_version_init vmlinux

... for this kernel? That'd make it possible for others to
perform/verify the analysis I mentioned above.

If you can share any more details from the crash, that'd be helpful. The
GPR dump would be *enormously* helpful in this case, and even a photo of
the crash log might be useful.

> This is with kernel 6.1.123. The last known good kernel I have available right
> now is 6.1.119. In the 6.1.120 changelog I see the following commit which seems
> potentially related?
> 
>  https://lore.kernel.org/all/20241106160448.2712997-1-mark.rutland@arm.com/

Last I looked, there was no obvious reason why that should have an
effect on this issue. It's possible that the differing asm constraints
have an effect on code generation, and happen to mask the issue.

From a quick scan, I note that the asm constraints *don't* include a
clobber for x17, and maybe that's getting clobbered by a veneer between
the BL and __arm_smccc_sve_check().
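
As a minimal sketch of that hypothesis (this is NOT the kernel's actual
SMCCC macro): under AAPCS64, x16 and x17 are the intra-procedure-call
scratch registers, and a linker-inserted veneer is allowed to use either
of them. Inline asm that branches to an out-of-line helper would
therefore want both in its clobber list, so that the compiler does not
keep a live value (such as an ADRP'd base address) in one of them across
the call. The name call_helper() is hypothetical, and a NOP stands in
for the BL so that the snippet builds standalone; the non-arm64 branch
exists only so that it compiles elsewhere.

```c
/*
 * Illustrative only: call_helper() is a hypothetical name, and the NOP
 * stands in for "bl <out-of-line helper>".
 */
static inline unsigned long call_helper(unsigned long arg)
{
#if defined(__aarch64__)
	/* Pin the argument/result to x0, as a real call would require. */
	register unsigned long x0 __asm__("x0") = arg;

	__asm__ __volatile__(
		"nop\n"
		: "+r" (x0)
		:
		/*
		 * x16/x17: may be used by a veneer placed between the
		 * branch and its target.
		 * x30: the link register, written by BL.
		 */
		: "x16", "x17", "x30", "cc", "memory");

	return x0;
#else
	/* Not arm64: just a compiler barrier so the sketch still builds. */
	__asm__ __volatile__("" : "+r" (arg) : : "memory");
	return arg;
#endif
}
```

With "x17" dropped from that list, the compiler would be free to keep a
value live in x17 across the asm, which would be consistent with the
clobbered GPR described above. Whether that is what happens here can
only be confirmed from the disassembly.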

As above, it would really help to have the disassembly for
arm_smccc_version_init(), and the GPRs at the time of the crash.

Mark.

> 
> That's stable commit [1].
> 
> The relevant upstream commit [2] is in linux 6.12, and that kernel version does
> not crash. Comparing [1] vs [2] I see differences, but I can't tell if they can
> help debug the issue further. 
> 
> Thanks,
>   ema
> 
> [1] https://git.kernel.org/linus/bfcaffd4cc2d61ecb0571c5baf127c4089978ad4
> [2] https://git.kernel.org/linus/8c462d56487e3abdbf8a61cedfe7c795a54f4a78


