[PATCH] lib: sbi: Revert entry_count before doing hsm stop in hsm wait loop
Nick Hu
nick.hu at sifive.com
Wed May 28 03:26:07 PDT 2025
On Tue, May 27, 2025 at 8:48 PM Anup Patel <apatel at ventanamicro.com> wrote:
>
> Using hsm stop in hsm wait loop causes secondary harts to be stuck
> forever in OpenSBI on RISC-V platforms where HSM hart hotplug is
> available and all harts come-up at the same time during system
> power-on.
>
> For example, lets say we have two harts A and B on a RISC-V platform
> with HSM hart hotplug which come-up at the same time during system
> power-on. The hart A enters OpenSBI before hart B hence it becomes
> the primary (or cold-boot) hart whereas hart B becomes the secondary
> (or warm-boot) hart. The hart A follows the OpenSBI cold-boot path
> and registers hsm device before hart B enters OpenSBI. The hart B
> eventually enters OpenSBI and follows the OpenSBI warm-boot path
> so it will increment it's own entry_count before entering hsm wait
> loop where it sees hsm device and stops itself. Later as part of
> the Linux boot-up sequence, hart A issues SBI HSM start call to
> bring-up hart B but OpenSBI sees entry_count != init_count for
> hart B in sbi_hsm_hart_start() hence hsm_device_hart_start() is
> not called for hart B resulting in hart B stuck forever in OpenSBI.
>
> To fix the above issue, revert entry_count before doing hsm stop
> in hsm wait loop.
>
> Fixes: d844deadec94 ("lib: sbi: Use hsm stop for hsm wait")
> Signed-off-by: Anup Patel <apatel at ventanamicro.com>
> ---
> include/sbi/sbi_init.h | 2 ++
> lib/sbi/sbi_hsm.c | 4 +++-
> lib/sbi/sbi_init.c | 13 +++++++++++++
> 3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/include/sbi/sbi_init.h b/include/sbi/sbi_init.h
> index c9013ea4..ad068674 100644
> --- a/include/sbi/sbi_init.h
> +++ b/include/sbi/sbi_init.h
> @@ -16,6 +16,8 @@ struct sbi_scratch;
>
> void __noreturn sbi_init(struct sbi_scratch *scratch);
>
> +void sbi_revert_entry_count(struct sbi_scratch *scratch);
> +
> unsigned long sbi_entry_count(u32 hartindex);
>
> unsigned long sbi_init_count(u32 hartindex);
> diff --git a/lib/sbi/sbi_hsm.c b/lib/sbi/sbi_hsm.c
> index e8128a39..557ab131 100644
> --- a/lib/sbi/sbi_hsm.c
> +++ b/lib/sbi/sbi_hsm.c
> @@ -176,8 +176,10 @@ static void sbi_hsm_hart_wait(struct sbi_scratch *scratch)
> * If the hsm_dev is ready and it support the hotplug, we can
> * use the hsm stop for more power saving
> */
> - if (hsm_device_has_hart_hotplug())
> + if (hsm_device_has_hart_hotplug()) {
> + sbi_revert_entry_count(scratch);
> hsm_device_hart_stop();
Hi Anup,
Thanks for addressing the bug =)
However, there may still be an issue in the following scenario:
If the HSM state of core B is SBI_HSM_STATE_START_PENDING at this
point, it won't enter the warm boot path. As a result, the
`init_count` will end up being `entry_count` + 1 after the warm init
path.
Would it make sense to increment `entry_count` after the
`sbi_hsm_init()` call instead?
> + }
>
> wfi();
> }
> diff --git a/lib/sbi/sbi_init.c b/lib/sbi/sbi_init.c
> index 62c32682..84a63748 100644
> --- a/lib/sbi/sbi_init.c
> +++ b/lib/sbi/sbi_init.c
> @@ -579,6 +579,19 @@ void __noreturn sbi_init(struct sbi_scratch *scratch)
> init_warmboot(scratch, hartid);
> }
>
> +void sbi_revert_entry_count(struct sbi_scratch *scratch)
> +{
> + unsigned long *entry_count, *init_count;
> +
> + if (!entry_count_offset || !init_count_offset)
> + sbi_hart_hang();
> +
> + entry_count = sbi_scratch_offset_ptr(scratch, entry_count_offset);
> + init_count = sbi_scratch_offset_ptr(scratch, init_count_offset);
> +
> + *entry_count = *init_count;
> +}
> +
> unsigned long sbi_entry_count(u32 hartindex)
> {
> struct sbi_scratch *scratch;
> --
> 2.43.0
>
Best Regards,
Nick
More information about the opensbi
mailing list