[PATCH] lib: sbi: Revert entry_count before doing hsm stop in hsm wait loop

Nick Hu nick.hu at sifive.com
Wed May 28 03:26:07 PDT 2025


On Tue, May 27, 2025 at 8:48 PM Anup Patel <apatel at ventanamicro.com> wrote:
>
> Using hsm stop in hsm wait loop causes secondary harts to be stuck
> forever in OpenSBI on RISC-V platforms where HSM hart hotplug is
> available and all harts come-up at the same time during system
> power-on.
>
> For example, lets say we have two harts A and B on a RISC-V platform
> with HSM hart hotplug which come-up at the same time during system
> power-on. The hart A enters OpenSBI before hart B hence it becomes
> the primary (or cold-boot) hart whereas hart B becomes the secondary
> (or warm-boot) hart. The hart A follows the OpenSBI cold-boot path
> and registers hsm device before hart B enters OpenSBI. The hart B
> eventually enters OpenSBI and follows the OpenSBI warm-boot path
> so it will increment it's own entry_count before entering hsm wait
> loop where it sees hsm device and stops itself. Later as part of
> the Linux boot-up sequence, hart A issues SBI HSM start call to
> bring-up hart B but OpenSBI sees entry_count != init_count for
> hart B in sbi_hsm_hart_start() hence hsm_device_hart_start() is
> not called for hart B resulting in hart B stuck forever in OpenSBI.
>
> To fix the above issue, revert entry_count before doing hsm stop
> in hsm wait loop.
>
> Fixes: d844deadec94 ("lib: sbi: Use hsm stop for hsm wait")
> Signed-off-by: Anup Patel <apatel at ventanamicro.com>
> ---
>  include/sbi/sbi_init.h |  2 ++
>  lib/sbi/sbi_hsm.c      |  4 +++-
>  lib/sbi/sbi_init.c     | 13 +++++++++++++
>  3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/include/sbi/sbi_init.h b/include/sbi/sbi_init.h
> index c9013ea4..ad068674 100644
> --- a/include/sbi/sbi_init.h
> +++ b/include/sbi/sbi_init.h
> @@ -16,6 +16,8 @@ struct sbi_scratch;
>
>  void __noreturn sbi_init(struct sbi_scratch *scratch);
>
> +void sbi_revert_entry_count(struct sbi_scratch *scratch);
> +
>  unsigned long sbi_entry_count(u32 hartindex);
>
>  unsigned long sbi_init_count(u32 hartindex);
> diff --git a/lib/sbi/sbi_hsm.c b/lib/sbi/sbi_hsm.c
> index e8128a39..557ab131 100644
> --- a/lib/sbi/sbi_hsm.c
> +++ b/lib/sbi/sbi_hsm.c
> @@ -176,8 +176,10 @@ static void sbi_hsm_hart_wait(struct sbi_scratch *scratch)
>                  * If the hsm_dev is ready and it support the hotplug, we can
>                  * use the hsm stop for more power saving
>                  */
> -               if (hsm_device_has_hart_hotplug())
> +               if (hsm_device_has_hart_hotplug()) {
> +                       sbi_revert_entry_count(scratch);
>                         hsm_device_hart_stop();

Hi Anup,
Thanks for addressing the bug =)

However, there may still be an issue in the following scenario:
If the HSM state of core B is SBI_HSM_STATE_START_PENDING at this
point, it won't enter the warm boot path. As a result, the
`init_count` will end up being `entry_count` + 1 after the warm init
path.

Would it make sense to increment `entry_count` after the
`sbi_hsm_init()` call instead?

> +               }
>
>                 wfi();
>         }
> diff --git a/lib/sbi/sbi_init.c b/lib/sbi/sbi_init.c
> index 62c32682..84a63748 100644
> --- a/lib/sbi/sbi_init.c
> +++ b/lib/sbi/sbi_init.c
> @@ -579,6 +579,19 @@ void __noreturn sbi_init(struct sbi_scratch *scratch)
>                 init_warmboot(scratch, hartid);
>  }
>
> +void sbi_revert_entry_count(struct sbi_scratch *scratch)
> +{
> +       unsigned long *entry_count, *init_count;
> +
> +       if (!entry_count_offset || !init_count_offset)
> +               sbi_hart_hang();
> +
> +       entry_count = sbi_scratch_offset_ptr(scratch, entry_count_offset);
> +       init_count = sbi_scratch_offset_ptr(scratch, init_count_offset);
> +
> +       *entry_count = *init_count;
> +}
> +
>  unsigned long sbi_entry_count(u32 hartindex)
>  {
>         struct sbi_scratch *scratch;
> --
> 2.43.0
>

Best Regards,
Nick



More information about the opensbi mailing list