[PATCH 04/30] firmware: google: Convert regular spinlock into trylock on panic path
Evan Green
evgreen at chromium.org
Tue May 3 11:03:37 PDT 2022
On Wed, Apr 27, 2022 at 3:51 PM Guilherme G. Piccoli
<gpiccoli at igalia.com> wrote:
>
> Currently the gsmi driver registers a panic notifier as well as
> reboot and die notifiers. The callbacks registered are called in
> atomic and very limited context - for instance, panic disables
> preemption, local IRQs and all other CPUs that aren't running the
> current panic function.
>
> With that said, taking a spinlock in this scenario is a
> dangerous invitation for a deadlock scenario. So, we fix
> that in this commit by changing the regular spinlock with
> a trylock, which is a safer approach.
>
> Fixes: 74c5b31c6618 ("driver: Google EFI SMI")
> Cc: Ard Biesheuvel <ardb at kernel.org>
> Cc: David Gow <davidgow at google.com>
> Cc: Evan Green <evgreen at chromium.org>
> Cc: Julius Werner <jwerner at chromium.org>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli at igalia.com>
> ---
> drivers/firmware/google/gsmi.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/google/gsmi.c b/drivers/firmware/google/gsmi.c
> index adaa492c3d2d..b01ed02e4a87 100644
> --- a/drivers/firmware/google/gsmi.c
> +++ b/drivers/firmware/google/gsmi.c
> @@ -629,7 +629,10 @@ static int gsmi_shutdown_reason(int reason)
> if (saved_reason & (1 << reason))
> return 0;
>
> - spin_lock_irqsave(&gsmi_dev.lock, flags);
> + if (!spin_trylock_irqsave(&gsmi_dev.lock, flags)) {
> + rc = -EBUSY;
> + goto out;
> + }

gsmi_shutdown_reason() is a common function that is also called on other
paths, such as reboot and thermal trip, where it may still make sense
to wait for the spinlock. Maybe we should add a parameter to
gsmi_shutdown_reason() so that the panic path gets the trylock, while
the other callers keep blocking on the lock rather than turning into
try-and-fail paths that can miss logs.
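
To make the parameter idea concrete, here is a rough userspace sketch,
not kernel code and not a proposed patch. The in_panic parameter, the
model_* helpers and shutdown_reason_model() are all made-up names, and a
C11 atomic_flag stands in for gsmi_dev.lock; the real function would keep
using spin_lock_irqsave()/spin_trylock_irqsave() and do the firmware call
where the comment says so.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for gsmi_dev.lock (assumption, not the kernel API). */
static atomic_flag model_lock = ATOMIC_FLAG_INIT;

/* Bitmask of reasons already logged, mirroring saved_reason in the patch. */
static unsigned int saved_reason;

/* Try once; returns true if the lock was acquired. */
static bool model_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&model_lock,
						  memory_order_acquire);
}

/* Spin until the lock is acquired, like a regular spinlock. */
static void model_lock_spin(void)
{
	while (!model_trylock())
		;
}

static void model_unlock(void)
{
	atomic_flag_clear_explicit(&model_lock, memory_order_release);
}

/*
 * Hypothetical variant of gsmi_shutdown_reason(): only the panic path
 * uses the trylock; reboot, thermal trip, etc. still wait for the lock.
 */
static int shutdown_reason_model(int reason, bool in_panic)
{
	/* Avoid duplicate entries, as in the quoted context lines. */
	if (saved_reason & (1u << reason))
		return 0;

	if (in_panic) {
		if (!model_trylock())
			return -EBUSY;	/* log is dropped on contention */
	} else {
		model_lock_spin();
	}

	/* Placeholder for the firmware call that writes the log entry. */
	saved_reason |= 1u << reason;

	model_unlock();
	return 0;
}
```

With something like this, the panic notifier would pass in_panic = true
and the other callers false. If the lock happens to be held when the panic
path runs, the call returns -EBUSY and that entry is not logged, which is
the tradeoff in question.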

Though thinking more about it, is this really a Good Change (TM)? Taking
the spinlock already disables local interrupts, so the only cases
where this change makes a difference are a panic from within the
section that holds the spinlock (in which case the callback is also
likely to panic), or an NMI that panics within that window. The
downside is that if one core was politely working through an event
with the lock held and another core panics, we might now lose the
panic log, even though it probably would have gone through fine had
the other core been given a chance to continue.
-Evan