[BUG] drivers/perf: hisi: kernel fails to boot since commit ac4511c9364c "drivers/perf: hisi: Add identifier sysfs file"

Shaokun Zhang zhangshaokun at hisilicon.com
Tue Jan 25 18:24:21 PST 2022


Hi Wang Cheng,

On 2022/1/25 22:35, Wang Cheng wrote:
> Hi respected kernel committers,
> 
> After upgrading from v5.10 to 5.17.0-rc1, my kernel fails to boot. By
> bisecting (compiling and installing the kernel several times), I found
> that the boot problem first appears between v5.10 and v5.11-rc1. Here
> are the boot messages of v5.11-rc1:
> [2.126166] workingset: timestamp_bits=42 max_order=21 bucket_order=0
> [2.127807] zbud: loaded
> [2.128010] integrity: Platform Keyring initialized
> [2.128014] Key type asymmetric registered
> [2.128017] Asymmetric key parser 'x509' registered
> [2.128026] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
> [2.128092] io scheduler mq-deadline registered
> [2.130976] pcieport 0000:00:01.0: can't derive routing for PCI INT A
> [2.130982] pcieport 0000:00:01.0: PCI INT A: no GSI
> [2.131272] pcieport 0000:00:01.0: can't derive routing for PCI INT A
> [2.131276] pcieport 0000:00:01.0: PCI INT A: no GSI
> [2.132188] ACPI: IORT: [Firmware Bug]: [map (____ptrval____)] conflicting mapping for input ID 0x7c00
> [2.132192] ACPI: IORT: [Firmware Bug]: applying workaround.
> [2.132259] ACPI: IORT: [Firmware Bug]: [map (____ptrval____)] conflicting mapping for input ID 0x7c00
> [2.132263] ACPI: IORT: [Firmware Bug]: applying workaround.
> [2.132285] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> [2.132850] efifb: probing for efifb
> [2.132875] efifb: framebuffer at 0x80000000000, using 3072k, total 3072k
> [2.132878] efifb: mode is 1024x768x32, linelength=4096, pages=1
> [2.132881] efifb: scrolling: redraw
> [2.132883] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
> [2.135097] Console: switching to colour frame buffer device 128x48
> [2.137221] fb0: EFI VGA frame buffer device
> [2.137655] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
> [2.137715] ACPI: Power Button [PWRB]
> [2.138153] ACPI GTDT: found 1 SBSA generic Watchdog(s).
> [2.140587] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [2.141685] Serial: AMBA driver
> [2.141828] msm_serial: driver initialized
> [2.142785] mousedev: PS/2 mouse device common for all mice
> [2.186108] rtc-efi rtc-efi.0: registered as rtc0
> [2.198458] rtc-efi rtc-efi.0: setting system clock to 2022-01-23T13:49:49 UTC(1642945789)
> [2.199157] ledtrig-cpu: registered to indicate activity on CPUs
> _
> Boot messages stop here. After about 2 seconds, the CPU fan starts
> running at full speed and stays there. I have to press the power button
> to turn the machine off.
> 
> By bisecting, I found that commit ac4511c9364c9a6390e8585cdd4596103bca16eb
> "drivers/perf: hisi: Add identifier sysfs file" is the likely cause of
> the boot problem. If I revert this commit on v5.11-rc1, the kernel boots
> fine. That is:
> $ git checkout v5.11-rc1
> $ git revert ac4511c9364c9a6390e8585cdd4596103bca16eb
> $ "compile and install kernel"
> kernel boots fine.
> 
> I tested a bit more. I suspect hisi_uncore_ddrc_pmu.c or the function
> hisi_uncore_pmu_identifier_attr_show(...) in commit ac4511c9364c.

The UEFI firmware enumerates more DDRC nodes than the SoC actually has, so
the system hangs on the superfluous node when the PMU driver is initialized.
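
On a kernel that still boots (for example v5.10), a rough way to see how many
DDRC PMU instances get registered is to count the entries on the perf
event_source bus. This is only a sketch: it assumes the usual
hisi_sccl<X>_ddrc<Y> naming, and the helper name count_ddrc_pmus and its
optional directory argument are made up for illustration.

```sh
#!/bin/sh
# Count HiSilicon DDRC PMU instances registered with perf.
# Assumption: the PMUs appear as hisi_sccl<X>_ddrc<Y> under
# /sys/bus/event_source/devices. The optional argument (hypothetical
# convenience) points the helper at a different directory.
count_ddrc_pmus() {
    dir="${1:-/sys/bus/event_source/devices}"
    count=0
    for entry in "$dir"/hisi_sccl*_ddrc*; do
        # With no match the glob stays literal, so check that it exists.
        [ -e "$entry" ] || continue
        count=$((count + 1))
    done
    echo "$count"
}
```

Calling count_ddrc_pmus with no argument prints the count for the running
system; a number larger than the DDRC channels the chip really has would be
consistent with the firmware problem described here.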

> Commit ac4511c9364c modified 5 files:
> 	drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
> 	drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
> 	drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
> 	drivers/perf/hisilicon/hisi_uncore_pmu.c
> 	drivers/perf/hisilicon/hisi_uncore_pmu.h
> As I understand it, a new function hisi_uncore_pmu_identifier_attr_show(...)
> is declared in hisi_uncore_pmu.h and defined in hisi_uncore_pmu.c, and it
> is used in hisi_uncore_ddrc_pmu.c, hisi_uncore_hha_pmu.c and
> hisi_uncore_l3c_pmu.c. I tested that if I revert only the change in
> hisi_uncore_ddrc_pmu.c, the kernel boots fine; if I revert only the changes
> in hisi_uncore_hha_pmu.c and hisi_uncore_l3c_pmu.c, the kernel fails to boot.
> That is, with the commit's changes kept only in:
> 	drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
> 	drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
> 	drivers/perf/hisilicon/hisi_uncore_pmu.c
> 	drivers/perf/hisilicon/hisi_uncore_pmu.h
> compile and install, kernel boots fine.
> 

Without commit ac4511c9364c ("drivers/perf: hisi: Add identifier sysfs file"),
the driver loads successfully because it doesn't access any physical register
at that point, but the system will still hang if you run perf on all the DDRC
nodes.
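
If you want to find out which node is the superfluous one on a kernel without
that commit, one option is to try the DDRC PMUs one at a time instead of all
together. The sketch below only prints one perf command per node and does not
run anything, since running it against the bad node is expected to hang the
machine. The helper name print_ddrc_perf_cmds is hypothetical, and the
hisi_sccl<X>_ddrc<Y> naming and the flux_rd event are assumptions; check the
PMU's events/ directory on your system.

```sh
#!/bin/sh
# Print one "perf stat" command per DDRC PMU so nodes can be tried one by one.
# Assumptions: hisi_sccl<X>_ddrc<Y> naming and a "flux_rd" event.
# Nothing is executed here; the commands are only printed.
print_ddrc_perf_cmds() {
    dir="${1:-/sys/bus/event_source/devices}"
    for entry in "$dir"/hisi_sccl*_ddrc*; do
        # Skip the literal pattern when nothing matches.
        [ -e "$entry" ] || continue
        pmu="${entry##*/}"
        echo "perf stat -a -e ${pmu}/flux_rd/ sleep 1"
    done
}
```

Running the printed commands by hand, one per boot if necessary, should show
which node name is the last one tried before the hang.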

> 	drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
> 	drivers/perf/hisilicon/hisi_uncore_pmu.c
> 	drivers/perf/hisilicon/hisi_uncore_pmu.h
> compile and install, kernel fails to boot.
> 
> $ sudo lshw
> cpu version: HUAWEI Kunpeng 920 2251K
> cores=8 enabledcores=8 threads=8

As John said, some people have reported a similar issue on this chip.

> 8Gib RAM
> $ uname -a
> Linux armdebian 5.10.0-9-arm64 #1 SMP Debian 5.10.70-1 (2021-09-30)
> aarch64 GNU/Linux
> 
> Do I understand this correctly? What else could I do to solve this?

I'm checking with the BIOS team to find out which version fixes the issue.
I will send you the right BIOS; could you help test it?
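
To compare against whichever version ends up carrying the fix, the currently
installed BIOS version can usually be read from DMI. A minimal sketch, assuming
the platform exposes SMBIOS data under /sys/class/dmi/id; the helper name
bios_version and its optional directory argument are made up for illustration.

```sh
#!/bin/sh
# Print the firmware version string the kernel read from SMBIOS/DMI.
# Assumption: DMI data is exposed under /sys/class/dmi/id.
bios_version() {
    dir="${1:-/sys/class/dmi/id}"
    if [ -r "$dir/bios_version" ]; then
        cat "$dir/bios_version"
    else
        # No DMI entry available on this system.
        echo "unknown"
    fi
}
```

Calling bios_version with no argument reports the version of the running
system, or "unknown" when the file is not there.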

Thanks,
Shaokun

> 
> Regards,
> 
> --
> Cheng
> .
> 


