[PATCH 1/2] lib: sbi_init: Avoid thundering hurd problem with coldbook_lock

Anup Patel anup at brainfault.org
Sun Aug 16 09:59:21 EDT 2020


On Sun, Aug 16, 2020 at 1:18 AM Jessica Clarke <jrtc27 at jrtc27.com> wrote:
>
> On 15 Aug 2020, at 18:00, Anup Patel <anup.patel at wdc.com> wrote:
> >
> > We can have thundering hurd problem with coldboot_lock where the
> > boot HART can potentially starve trying to acquire coldboot_lock
> > because some of the non-boot HARTs are continuously acquiring and
> > releasing coldboot_lock. This can happen if WFI is a NOP
>
> That is neither necessary nor sufficient, it's solely based on
> whether the hart believes an M-mode software interrupt is pending?

Ahh yes, I will correct this statement. Thanks for pointing.

>
> > OR if
> > MIP.MSIP bit is already set for some of the non-boot HARTs.
> >
> > To avoid thundering hurd problem for coldboot_lock, we convert
> > coldboot_done flag into atomic variable and using coldboot_lock
> > only for coldboot_wait_hmask.
> >
> > Signed-off-by: Anup Patel <anup.patel at wdc.com>
> > Tested-by: Heinrich Schuchardt <xypron.glpk at gmx.de>
> > ---
> > lib/sbi/sbi_init.c | 19 ++++++++++++-------
> > 1 file changed, 12 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/sbi/sbi_init.c b/lib/sbi/sbi_init.c
> > index a7fb848..6b58983 100644
> > --- a/lib/sbi/sbi_init.c
> > +++ b/lib/sbi/sbi_init.c
> > @@ -85,9 +85,10 @@ static void sbi_boot_prints(struct sbi_scratch *scratch, u32 hartid)
> > }
> >
> > static spinlock_t coldboot_lock = SPIN_LOCK_INITIALIZER;
> > -static unsigned long coldboot_done = 0;
> > static struct sbi_hartmask coldboot_wait_hmask = { 0 };
> >
> > +static atomic_t coldboot_done = ATOMIC_INITIALIZER(0);
> > +
> > static void wait_for_coldboot(struct sbi_scratch *scratch, u32 hartid)
> > {
> >       unsigned long saved_mie, cmip;
> > @@ -105,16 +106,20 @@ static void wait_for_coldboot(struct sbi_scratch *scratch, u32 hartid)
> >       /* Mark current HART as waiting */
> >       sbi_hartmask_set_hart(hartid, &coldboot_wait_hmask);
> >
> > +     /* Release coldboot lock */
> > +     spin_unlock(&coldboot_lock);
> > +
> >       /* Wait for coldboot to finish using WFI */
> > -     while (!coldboot_done) {
> > -             spin_unlock(&coldboot_lock);
> > +     while (!atomic_read(&coldboot_done)) {
>
> Personally I'd make this a relaxed read and then explicitly fence
> outside the loop, as otherwise if we end up with MSIP erroneously set
> there may be a lot of cache coherency traffic due to repeated
> unnecessary fences?

Okay, good suggestion.

>
> >               do {
> >                       wfi();
> >                       cmip = csr_read(CSR_MIP);
> >                } while (!(cmip & MIP_MSIP));
> > -             spin_lock(&coldboot_lock);
> >       };
> >
> > +     /* Acquire coldboot lock */
> > +     spin_lock(&coldboot_lock);
> > +
> >       /* Unmark current HART as waiting */
> >       sbi_hartmask_clear_hart(hartid, &coldboot_wait_hmask);
> >
> > @@ -132,12 +137,12 @@ static void wake_coldboot_harts(struct sbi_scratch *scratch, u32 hartid)
> > {
> >       const struct sbi_platform *plat = sbi_platform_ptr(scratch);
> >
> > +     /* Mark coldboot done */
> > +     atomic_write(&coldboot_done, 1);
>
> This only needs to be a store release.

Agreed, store-release will be much better here.

Regards,
Anup



More information about the opensbi mailing list