[PATCH v2 1/6] arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB

Tue Dec 17 09:44:13 PST 2024

Hi,

On Tue, Dec 17, 2024 at 5:25 AM Julius Werner <jwerner at chromium.org> wrote:
>
> > > - Refactor max_bhb_k in spectre_bhb_loop_affected() to be a global
> > > instead, which starts out as zero, is updated by
> > > spectre_bhb_loop_affected(), and is directly read by
> > > spectre_bhb_patch_loop_iter() (could probably remove the `scope`
> > > argument from spectre_bhb_loop_affected() then).
> >
> > Refactoring "max_bhb_k" would be a general cleanup and not related to
> > anything else here, right?
> >
> > ...but the function is_spectre_bhb_affected() is called from
> > "cpu_errata.c" and has a scope. Can we guarantee that it's always
> > "SCOPE_LOCAL_CPU"? I tried reading through the code and it's
> > _probably_ SCOPE_LOCAL_CPU most of the time, but it doesn't seem worth
> > it to add an assumption here for a small cleanup.
> >
> > I'm not going to do this, though I will move "max_bhb_k" to be a
> > global for the suggestion below.
>
> If you make max_bhb_k a global, then whether you change all
> `spectre_bhb_loop_affected(SCOPE_SYSTEM)` calls to read the global
> directly or whether you keep it such that
> `spectre_bhb_loop_affected()` simply returns that global for
> SCOPE_SYSTEM makes no difference. I just think reading the global
> directly would look a bit cleaner. Calling a function that's called
> "...affected()" when you're really just trying to find out the K-value
> feels a bit odd.
>
> For is_spectre_bhb_affected(), I was assuming the change below where
> you combine all the `return true` paths, so the scope question
> wouldn't matter there.

Ah, right. OK.

> > > - Change the `return false` into `return true` at the end of
> > > is_spectre_bhb_affected (in fact, you can probably take out some of
> > > the other calls that result in returning true as well then)
> >
> > I'm not sure you can take out the other calls. Specifically, both
> > spectre_bhb_loop_affected() and is_spectre_bhb_fw_affected() _need_ to
> > be called for each CPU so that they update static globals, right?
> > Maybe we could get rid of the call to supports_clearbhb(), but that
> > _would_ change things in ways that are not obvious. Specifically I
> > could believe that someone could have backported "clear BHB" to their
> > core but their core is otherwise listed as "loop affected". That would
> > affect "max_bhb_k". Maybe (?) it doesn't matter in this case, but I'd
> > rather not mess with it unless someone really wants me to and is sure
> > it's safe.
>
> Yes, but spectre_bhb_enable_mitigation() already calls all those
> functions on its own again anyway, so I'm pretty sure the "must be
> called once for each CPU" part of spectre_bhb_loop_affected() is
> covered by that. (Besides, it would be really awful if they had made a
> function whose name starts with is_... have critical side-effects that
> break things when it doesn't get called.)

The existing predicates already do change globals before my patch and
changing that is outside of the scope of what I'm willing to entertain
with my patchset

Given that I'm not going to change the way the existing predicates
work, if I move the "fallback" setting `max_bhb_k` to 32 to
spectre_bhb_enable_mitigation() then when we set `max_bhb_k` becomes
inconsistent between recognized and unrecognized CPUs. One gets set in
the predicate and one doesn't. Even if it works, this inconsistency
feels like bad design to me. Also, setting `max_bhb_k` to the max at
the end of is_spectre_bhb_affected() would perhaps _help_ someone
realize that the predicate has side effects because they'd see it in
the function itself and not have to dig down.

I would also say that having `max_bhb_k` get set in an inconsistent
place opens us up for bugs in the future. Even if it works today, I
imagine someone could change things in the future such that
spectre_bhb_enable_mitigation() reads `max_bhb_k` and essentially
caches it (maybe it constructs an instruction based on it). If that
happened things could be subtly broken for the "unrecognized CPU" case
because the first CPU would "cache" the value without it having been
called on all CPUs.

In case you can't tell, I'm still not convinced and will plan to keep
setting `max_bhb_k = 32` in is_spectre_bhb_affected().

> > > - In spectre_bhb_enable_mitigations(), at the end of the long if-else
> > > chain, put a last else block that prints your WARN_ONCE(), sets the
> > > max_bhb_k global to 32, and then does the same stuff that the `if
> > > (spectre_bhb_loop_affected())` block would have installed (maybe
> > > factoring that out into a helper function called from both cases).
> >
> > ...or just reorder it so that the spectre_bhb_loop_affected() code is
> > last and it can be the "else". Then I can WARN_ONCE() if
> > spectre_bhb_loop_affected(). ...or I could just do the WARN_ONCE()
> > when I get to the end of is_spectre_bhb_affected() and set "max_bhb_k"
> > to 32 there. I'd actually rather do that so that "max_bhb_k" is
> > consistently set after is_spectre_bhb_affected() is called on all
> > cores regardless of whether or not some cores are unknown.
>
> Yeah, you can reorder the loops too. I don't feel like moving this
> into is_spectre_bhb_affected() would be a good idea. Functions with a
> predicate name like that really shouldn't have such side effects.
> Besides, I think is_spectre_bhb_affected() is probably called more
> often per CPU, both once from spectre_bhb_enable_mitigation() and by
> whatever calls the `matches` pointer in the cpu_errata struct.
> spectre_bhb_enable_mitigation() seems to be the function that's called
> once for each CPU on boot to install the correct mitigation, so that
> feels like the best spot to put the fallback logic to me.

As per above, while I agree that having predicate functions w/ side
effects is not ideal, that predates my patch series and I'd rather
things work consistently.

-Doug