[PATCH] perf: RISCV: Fix panic on pmu overflow handler

Alexandre Ghiti alex at ghiti.fr
Tue Feb 27 02:36:17 PST 2024


Hi Fei,

On 27/02/2024 04:07, Fei Wu wrote:
> Sign extension of (1 << idx) from int is not desired when setting bits
> in unsigned long overflowed_ctrs, kernel panics if 31 is a valid lidx.
> This panic happens when 'perf record -e branches' on a sophgo machine.
>
> [  212.845953] epc : ffffffff80afc288 ra : ffffffff80afd310 sp : fffffff6e36928f0
> [  212.853474]  gp : ffffffff821f7f48 tp : ffffffd9033b9900 t0 : 0000002ad69e9978
> [  212.861069]  t1 : 000000000000002a t2 : ffffffff801764d2 s0 : fffffff6e3692ab0
> [  212.868637]  s1 : 0000000000000020 a0 : 0000000000000000 a1 : 0000000000000015
> [  212.876021]  a2 : 0000000000000000 a3 : 0000000000000015 a4 : 0000000000000020
> [  212.883482]  a5 : ffffffd7ff880640 a6 : 000000000005a569 a7 : ffffffffffffffd5
> [  212.891191]  s2 : 000000000000ffff s3 : 0000000000000000 s4 : ffffffd7ff880540
> [  212.898707]  s5 : 0000000000504d55 s6 : ffffffd902443000 s7 : ffffffff821fe1f8
> [  212.906329]  s8 : 000000007fffffff s9 : ffffffd7ff880540 s10: ffffffd9147a1098
> [  212.914151]  s11: 0000000080000000 t3 : 0000000000000003 t4 : ffffffff80186226
> [  212.921773]  t5 : ffffffff802455ca t6 : ffffffd9058900e8
> [  212.927300] status: 0000000200000100 badaddr: 0000000000000098 cause: 000000000000000d
> [  212.935575] [<ffffffff80afc288>] riscv_pmu_ctr_get_width_mask+0x8/0x60
> [  212.942391] [<ffffffff80079922>] handle_percpu_devid_irq+0x98/0x1e8
> [  212.948855] [<ffffffff80073d06>] generic_handle_domain_irq+0x28/0x36
> [  212.955521] [<ffffffff80481444>] riscv_intc_irq+0x36/0x4e
> [  212.961269] [<ffffffff80ca5fce>] handle_riscv_irq+0x4a/0x74
> [  212.967270] [<ffffffff80ca6afc>] do_irq+0x60/0x90
> [  212.972284] Code: b580 60a2 6402 5529 0141 8082 0013 0000 0013 0000 (6d5c) b783
> [  212.980036] ---[ end trace 0000000000000000 ]---
> [  212.984874] Kernel panic - not syncing: Fatal exception in interrupt
> [  212.991506] SMP: stopping secondary CPUs
> [  212.995964] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>
> Signed-off-by: Fei Wu <fei2.wu at intel.com>
> ---
>   drivers/perf/riscv_pmu_sbi.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> index 16acd4dcdb96..c87c459e52de 100644
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
> @@ -731,14 +731,14 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
>   		/* compute hardware counter index */
>   		hidx = info->csr - CSR_CYCLE;
>   		/* check if the corresponding bit is set in sscountovf */
> -		if (!(overflow & (1 << hidx)))
> +		if (!(overflow & (1UL << hidx)))
>   			continue;
>   
>   		/*
>   		 * Keep a track of overflowed counters so that they can be started
>   		 * with updated initial value.
>   		 */
> -		overflowed_ctrs |= 1 << lidx;
> +		overflowed_ctrs |= 1UL << lidx;
>   		hw_evt = &event->hw;
>   		riscv_pmu_event_update(event);
>   		perf_sample_data_init(&data, 0, hw_evt->last_period);


That's a good catch. Would you mind using the BIT() macro instead? I also
see two other instances of "1 <<" in this file. I think they are safe since
we only have 32 hw perf counters, but I may be missing something. @Atish,
what do you think?
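
For anyone following along, here is a minimal userspace sketch of the sign
extension (not kernel code). It assumes a 64-bit mask, as unsigned long is
on rv64. The names set_bit_int(), set_bit_ul(), count_bits() and BIT64()
are made up for illustration; BIT64() only mimics the kernel's BIT(), which
shifts an unsigned long 1. INT32_MIN stands in for the value "1 << 31"
yields as a 32-bit int, so the sketch does not rely on an overflowing
signed shift.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's BIT(): the shift is done in 64 bits. */
#define BIT64(nr) (UINT64_C(1) << (nr))

/*
 * Buggy pattern: the bit is built as a 32-bit signed int and only then
 * widened to the 64-bit mask. Assumes 0 <= idx <= 31.
 */
static uint64_t set_bit_int(uint64_t mask, int idx)
{
	/* For idx == 31 the int value is negative (INT32_MIN). */
	int32_t bit = (idx == 31) ? INT32_MIN : (int32_t)(1 << idx);

	/* Widening a negative int sign-extends: bits 32..63 get set too. */
	return mask | (uint64_t)bit;
}

/* Fixed pattern: the shift happens in the width of the mask. */
static uint64_t set_bit_ul(uint64_t mask, int idx)
{
	return mask | BIT64(idx);
}

/* Count how many bits a consumer walking the mask would see as set. */
static int count_bits(uint64_t mask)
{
	int n = 0;

	for (; mask; mask >>= 1)
		n += (int)(mask & 1);
	return n;
}
```

With idx = 31, set_bit_int(0, 31) gives 0xffffffff80000000 (33 bits set),
while set_bit_ul(0, 31) gives 0x0000000080000000 (1 bit set). For idx below
31 the two agree, which is why the problem only shows up when index 31 is
in use.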

Thanks,

Alex




