[PATCHv8 1/5] powerpc/setup : Enable boot_cpu_hwid for PPC32

Sourabh Jain sourabhjain at linux.ibm.com
Mon Oct 9 21:44:01 PDT 2023


Hello Pingfan,

With this patch series applied, the kdump kernel fails to boot on 
powerpc with nr_cpus=1.

Console logs:
-------------------
[root]# echo c > /proc/sysrq-trigger
[   74.783235] sysrq: Trigger a crash
[   74.783244] Kernel panic - not syncing: sysrq triggered crash
[   74.783252] CPU: 58 PID: 3838 Comm: bash Kdump: loaded Not tainted 6.6.0-rc5pf-nr-cpus+ #3
[   74.783259] Hardware name: POWER10 (raw) phyp pSeries
[   74.783275] Call Trace:
[   74.783280] [c00000020f4ebac0] [c000000000ed9f38] dump_stack_lvl+0x6c/0x9c (unreliable)
[   74.783291] [c00000020f4ebaf0] [c000000000150300] panic+0x178/0x438
[   74.783298] [c00000020f4ebb90] [c000000000936d48] sysrq_handle_crash+0x28/0x30
[   74.783304] [c00000020f4ebbf0] [c00000000093773c] __handle_sysrq+0x10c/0x250
[   74.783309] [c00000020f4ebc90] [c000000000937fa8] write_sysrq_trigger+0xc8/0x168
[   74.783314] [c00000020f4ebcd0] [c000000000665d8c] proc_reg_write+0x10c/0x1b0
[   74.783321] [c00000020f4ebd00] [c00000000058da54] vfs_write+0x104/0x4b0
[   74.783326] [c00000020f4ebdc0] [c00000000058dfdc] ksys_write+0x7c/0x140
[   74.783331] [c00000020f4ebe10] [c000000000033a64] system_call_exception+0x144/0x3a0
[   74.783337] [c00000020f4ebe50] [c00000000000c554] system_call_common+0xf4/0x258
[   74.783343] --- interrupt: c00 at 0x7fffa0721594
[   74.783352] NIP:  00007fffa0721594 LR: 00007fffa0697bf4 CTR: 0000000000000000
[   74.783364] REGS: c00000020f4ebe80 TRAP: 0c00   Not tainted (6.6.0-rc5pf-nr-cpus+)
[   74.783376] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 28222202  XER: 00000000
[   74.783394] IRQMASK: 0
[   74.783394] GPR00: 0000000000000004 00007ffffc4b6800 00007fffa0807300 0000000000000001
[   74.783394] GPR04: 000000013549ea60 0000000000000002 0000000000000010 0000000000000000
[   74.783394] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   74.783394] GPR12: 0000000000000000 00007fffa0abaf70 0000000040000000 000000011a0f9798
[   74.783394] GPR16: 000000011a0f9724 000000011a097688 000000011a02ff70 000000011a0fd568
[   74.783394] GPR20: 0000000135554bf0 0000000000000001 000000011a0aa478 00007ffffc4b6a24
[   74.783394] GPR24: 00007ffffc4b6a20 000000011a0faf94 0000000000000002 000000013549ea60
[   74.783394] GPR28: 0000000000000002 00007fffa08017a0 000000013549ea60 0000000000000002
[   74.783440] NIP [00007fffa0721594] 0x7fffa0721594
[   74.783443] LR [00007fffa0697bf4] 0x7fffa0697bf4
[   74.783447] --- interrupt: c00
I'm in purgatory
[    0.000000] radix-mmu: Page sizes from device-tree:
[    0.000000] radix-mmu: Page size shift = 12 AP=0x0
[    0.000000] radix-mmu: Page size shift = 16 AP=0x5
[    0.000000] radix-mmu: Page size shift = 21 AP=0x1
[    0.000000] radix-mmu: Page size shift = 30 AP=0x2
[    0.000000] Activating Kernel Userspace Access Prevention
[    0.000000] Activating Kernel Userspace Execution Prevention
[    0.000000] radix-mmu: Mapped 0x0000000000000000-0x0000000000010000 with 64.0 KiB pages (exec)
[    0.000000] radix-mmu: Mapped 0x0000000000010000-0x0000000000200000 with 64.0 KiB pages
[    0.000000] radix-mmu: Mapped 0x0000000000200000-0x0000000020000000 with 2.00 MiB pages
[    0.000000] radix-mmu: Mapped 0x0000000020000000-0x0000000022600000 with 2.00 MiB pages (exec)
[    0.000000] radix-mmu: Mapped 0x0000000022600000-0x0000000040000000 with 2.00 MiB pages
[    0.000000] radix-mmu: Mapped 0x0000000040000000-0x0000000180000000 with 1.00 GiB pages
[    0.000000] radix-mmu: Mapped 0x0000000180000000-0x00000001a0000000 with 2.00 MiB pages
[    0.000000] lpar: Using radix MMU under hypervisor
[    0.000000] Linux version 6.6.0-rc5pf-nr-cpus+ (root at ltcever7x0-lp1.aus.stglabs.ibm.com) (gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20), GNU ld version 2.30-123.el8) #3 SMP Mon Oct  9 11:07:41 CDT 2023
[    0.000000] Found initrd at 0xc000000022e60000:0xc0000000248f08d8
[    0.000000] Hardware name: IBM,9043-MRX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_016) hv:phyp pSeries
[    0.000000] printk: bootconsole [udbg0] enabled
[    0.000000] the round shift between dt seq and the cpu logic number: 56
[    0.000000] BUG: Unable to handle kernel data access on write at 0xc0000001a0000000
[    0.000000] Faulting instruction address: 0xc000000022009c64
[    0.000000] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.000000] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[    0.000000] Modules linked in:
[    0.000000] CPU: 2 PID: 0 Comm: swapper Not tainted 6.6.0-rc5pf-nr-cpus+ #3
[    0.000000] Hardware name:  POWER10 (raw)  hv:phyp pSeries
[    0.000000] NIP:  c000000022009c64 LR: c000000022009c54 CTR: c0000000201ff348
[    0.000000] REGS: c000000022aebb00 TRAP: 0300   Not tainted (6.6.0-rc5pf-nr-cpus+)
[    0.000000] MSR:  8000000000001033 <SF,ME,IR,DR,RI,LE> CR: 28222824  XER: 00000001
[    0.000000] CFAR: c000000020031574 DAR: c0000001a0000000 DSISR: 42000000 IRQMASK: 1
[    0.000000] GPR00: c000000022009ba0 c000000022aebda0 c0000000213d1300 0000000000000004
[    0.000000] GPR04: 0000000000000001 c000000022aebbc0 c000000022aebbb8 0000000000000000
[    0.000000] GPR08: 0000000000000001 c00000019ffffff8 000000000000003a c0000000229c8a78
[    0.000000] GPR12: 0000000000002000 c000000022e4a800 c0000000211d34b8 c0000000211d3aa8
[    0.000000] GPR16: c0000000211d75a0 c0000000211d75b0 c0000000225f3b98 0000000000000000
[    0.000000] GPR20: 0000000000000001 0000000000000001 0000000000000001 0000000000000001
[    0.000000] GPR24: 0000000000000008 0000000000000000 0000000000000001 c00000019ffffdc0
[    0.000000] GPR28: 0000000000000002 c000000022b368e0 c000000022aebe08 0000000000000008
[    0.000000] NIP [c000000022009c64] smp_setup_cpu_maps+0x420/0x724
[    0.000000] LR [c000000022009c54] smp_setup_cpu_maps+0x410/0x724
[    0.000000] Call Trace:
[    0.000000] [c000000022aebda0] [c000000022009ba0] smp_setup_cpu_maps+0x35c/0x724 (unreliable)
[    0.000000] [c000000022aebeb0] [c00000002200a19c] setup_arch+0x1b8/0x54c
[    0.000000] [c000000022aebf30] [c000000022003f88] start_kernel+0xb0/0x768
[    0.000000] [c000000022aebfe0] [c00000002000d888] start_here_common+0x1c/0x20
[    0.000000] Code: 3929ffff 7f89e040 409c002c 7ec4b378 7f83e378 4a027939 7f83e378 4a0278e5 e95b0018 3d22017d e929f028 7d4ac42c <7d49c12e> eb7b0000 7e99a378 4bffff3c
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000]
[    0.000000] Kernel panic - not syncing: Fatal exception
[    0.000000] Rebooting in 180 seconds..

However, the kdump kernel boots fine when the crash is triggered on CPU 0;
the failure above was triggered on CPU 58.

Thanks,
Sourabh Jain


On 09/10/23 17:00, Pingfan Liu wrote:
> In order to identify the boot cpu, its intserv[] should be recorded and
> checked in smp_setup_cpu_maps().
>
> smp_setup_cpu_maps() is shared between PPC64 and PPC32. PPC64 already
> uses boot_cpu_hwid to carry that information, so enable the variable on
> PPC32 as well; a coming patch will use it there for the same purpose.
>
> Signed-off-by: Pingfan Liu <piliu at redhat.com>
> Cc: Michael Ellerman <mpe at ellerman.id.au>
> Cc: Nicholas Piggin <npiggin at gmail.com>
> Cc: Christophe Leroy <christophe.leroy at csgroup.eu>
> Cc: Mahesh Salgaonkar <mahesh at linux.ibm.com>
> Cc: Wen Xiong <wenxiong at us.ibm.com>
> Cc: Baoquan He <bhe at redhat.com>
> Cc: Ming Lei <ming.lei at redhat.com>
> Cc: kexec at lists.infradead.org
> To: linuxppc-dev at lists.ozlabs.org
> ---
>   arch/powerpc/include/asm/smp.h     | 2 +-
>   arch/powerpc/kernel/prom.c         | 3 +--
>   arch/powerpc/kernel/setup-common.c | 2 --
>   3 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
> index aaaa576d0e15..5db9178cc800 100644
> --- a/arch/powerpc/include/asm/smp.h
> +++ b/arch/powerpc/include/asm/smp.h
> @@ -26,7 +26,7 @@
>   #include <asm/percpu.h>
>   
>   extern int boot_cpuid;
> -extern int boot_cpu_hwid; /* PPC64 only */
> +extern int boot_cpu_hwid;
>   extern int spinning_secondaries;
>   extern u32 *cpu_to_phys_id;
>   extern bool coregroup_enabled;
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 0b5878c3125b..ec82f5bda908 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -372,8 +372,7 @@ static int __init early_init_dt_scan_cpus(unsigned long node,
>   	    be32_to_cpu(intserv[found_thread]));
>   	boot_cpuid = found;
>   
> -	if (IS_ENABLED(CONFIG_PPC64))
> -		boot_cpu_hwid = be32_to_cpu(intserv[found_thread]);
> +	boot_cpu_hwid = be32_to_cpu(intserv[found_thread]);
>   
>   	/*
>   	 * PAPR defines "logical" PVR values for cpus that
> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
> index d2a446216444..1b19a9815672 100644
> --- a/arch/powerpc/kernel/setup-common.c
> +++ b/arch/powerpc/kernel/setup-common.c
> @@ -87,9 +87,7 @@ EXPORT_SYMBOL(machine_id);
>   int boot_cpuid = -1;
>   EXPORT_SYMBOL_GPL(boot_cpuid);
>   
> -#ifdef CONFIG_PPC64
>   int boot_cpu_hwid = -1;
> -#endif
>   
>   /*
>    * These are used in binfmt_elf.c to put aux entries on the stack



