[PATCH] riscv: fix potential panic during CPU hot-plug

Wed Nov 8 09:18:30 PST 2023

Dear Evan,

On 08.11.2023 17:47, Evan Green wrote:
> On Wed, Nov 8, 2023 at 3:29 AM Marek Szyprowski
> <m.szyprowski at samsung.com>  wrote:
>> Commit 584ea6564bca ("RISC-V: Probe for unaligned access speed") added a
>> test for unaligned access speed. It is being performed when given CPU is
>> onlined. Then, another test for misaligned access emulation has been
>> added in commit 71c54b3d169d ("riscv: report misaligned accesses
>> emulation to hwprobe"). This unaligned access speed doesn't really change
>> after the boot, so it is sufficient to do this test only once. This has
>> been partially added by commit c20d36cc2a20 ("riscv: don't probe
>> unaligned access speed if already done"), but this optimisation works
>> only if RISCV_HWPROBE_MISALIGNED_EMULATED is returned by the latter
>> check. Otherwise the 'misaligned_access_speed' pcpu varliable is
>> overwritten with RISCV_HWPROBE_MISALIGNED_UNKNOWN value, what makes the
>> first check to be always performed.
>>
>> Recently I've noticed that the first check introduced a regression in the
>> CPU hot-plug mechanism. This can be observed as a following panic on
>> QEmu:
>>
>> CPU1: off
>> cpu1: Ratio of byte access time to unaligned word access is 7.00, unaligned accesses are fast
>> CPU1: off
>> CPU1: failed to come online
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 215 at kernel/smp.c:437 __flush_smp_call_function_queue+0x90/0x292
>> Modules linked in:
>> CPU: 0 PID: 215 Comm: bash Not tainted 6.5.0-rc1+ #7524
>> Hardware name: riscv-virtio,qemu (DT)
>> epc : __flush_smp_call_function_queue+0x90/0x292
>>   ra : smpcfd_dying_cpu+0xe/0x1c
>> ...
>> [<ffffffff800c904a>] __flush_smp_call_function_queue+0x90/0x292
>> [<ffffffff800c9ae8>] smpcfd_dying_cpu+0xe/0x1c
>> [<ffffffff80012b2e>] cpuhp_invoke_callback+0x124/0x322
>> [<ffffffff800142f8>] _cpu_up+0x218/0x322
>> [<ffffffff8001445e>] cpu_up+0x5c/0x98
>> [<ffffffff80014bb8>] cpu_device_up+0x14/0x1c
>> [<ffffffff8066d68e>] cpu_subsys_online+0x10/0x18
>> [<ffffffff806680e8>] device_online+0x56/0x72
>> [<ffffffff80668190>] online_store+0x8c/0xae
>> [<ffffffff80662dde>] dev_attr_store+0xe/0x1a
>> [<ffffffff802b53bc>] sysfs_kf_write+0x2e/0x4c
>> [<ffffffff802b4750>] kernfs_fop_write_iter+0xe8/0x14e
>> [<ffffffff80230170>] vfs_write+0x2a4/0x3e0
>> [<ffffffff802303e8>] ksys_write+0x5e/0xc8
>> [<ffffffff80230460>] sys_write+0xe/0x16
>> [<ffffffff80a4a8de>] do_trap_ecall_u+0xe0/0xf4
>> [<ffffffff80003e8c>] ret_from_exception+0x0/0x64
>> irq event stamp: 12335
>> hardirqs last  enabled at (12335): [<ffffffff8007e542>] console_unlock+0x156/0x186
>> hardirqs last disabled at (12334): [<ffffffff8007e52c>] console_unlock+0x140/0x186
>> softirqs last  enabled at (12154): [<ffffffff80a560d0>] __do_softirq+0x3b0/0x470
>> softirqs last disabled at (12147): [<ffffffff80019340>] __irq_exit_rcu+0xa6/0xd0
>> ---[ end trace 0000000000000000 ]---
>> ------------[ cut here ]------------
>> kernel BUG at kernel/irq_work.c:245!
>> Kernel BUG [#1]
>> Modules linked in:
>> CPU: 0 PID: 215 Comm: bash Tainted: G        W          6.5.0-rc1+ #7524
>> Hardware name: riscv-virtio,qemu (DT)
>> epc : irq_work_run_list+0x30/0x32
>>   ra : irq_work_run+0x2a/0x4c
>> ...
>> [<ffffffff8011a346>] irq_work_run_list+0x30/0x32
>> [<ffffffff8011a372>] irq_work_run+0x2a/0x4c
>> [<ffffffff800c9aec>] smpcfd_dying_cpu+0x12/0x1c
>> [<ffffffff80012b2e>] cpuhp_invoke_callback+0x124/0x322
>> [<ffffffff800142f8>] _cpu_up+0x218/0x322
>> [<ffffffff8001445e>] cpu_up+0x5c/0x98
>> [<ffffffff80014bb8>] cpu_device_up+0x14/0x1c
>> [<ffffffff8066d68e>] cpu_subsys_online+0x10/0x18
>> [<ffffffff806680e8>] device_online+0x56/0x72
>> [<ffffffff80668190>] online_store+0x8c/0xae
>> [<ffffffff80662dde>] dev_attr_store+0xe/0x1a
>> [<ffffffff802b53bc>] sysfs_kf_write+0x2e/0x4c
>> [<ffffffff802b4750>] kernfs_fop_write_iter+0xe8/0x14e
>> [<ffffffff80230170>] vfs_write+0x2a4/0x3e0
>> [<ffffffff802303e8>] ksys_write+0x5e/0xc8
>> [<ffffffff80230460>] sys_write+0xe/0x16
>> [<ffffffff80a4a8de>] do_trap_ecall_u+0xe0/0xf4
>> [<ffffffff80003e8c>] ret_from_exception+0x0/0x64
>> Code: 8526 6084 f0ef f71f fce5 60e2 6442 64a2 6105 8082 (9002) 1101
>> ---[ end trace 0000000000000000 ]---
>> Kernel panic - not syncing: Fatal exception in interrupt
>> SMP: stopping secondary CPUs
>> SMP: failed to stop secondary CPUs 0-1
>> ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>>
>> To avoid calling those checks in the CPU hot-plug paths again and again,
>> simply move the check at the beginning of the check_unaligned_access()
>> function and rely on the value determined during the system boot.
>>
>> Fixes: 584ea6564bca ("RISC-V: Probe for unaligned access speed")
>> Fixes: c20d36cc2a20 ("riscv: don't probe unaligned access speed if already done")
>> Fixes: 71c54b3d169d ("riscv: report misaligned accesses emulation to hwprobe")
>> Signed-off-by: Marek Szyprowski<m.szyprowski at samsung.com>
> Hi Marek,
> Thanks for the patch. I happened to spot that bug too
> (check_unaligned_access_emulated() clobbering the per-cpu variable
> back to unknown). Since I was rearranging that code to try to run the
> speed measurements in parallel, I moved that check around to hopefully
> solve the same issue you're reporting. Can you see if this patch also
> fixes your issue:
> https://lore.kernel.org/lkml/20231106225855.3121724-1-evan@rivosinc.com/
> . It's also in Palmer's for-next tree as 55e0bf49a0d0 ("RISC-V: Probe
> misaligned access speed in parallel").

Thanks for for the information. Yes, Your patch also fixes this issue, 
but it is a bit invasive (it perfectly fits for the -next material 
though). Maybe my little cleanup could be applied as a fix during -rc 
cycle? It has an advantage that it can be easily backported to v6.6-stable.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland