[PATCH 1/4] arm64: alternative: wait for other CPUs before patching
Will Deacon
will at kernel.org
Mon Dec 13 05:27:11 PST 2021
On Mon, Dec 13, 2021 at 01:01:25PM +0000, Mark Rutland wrote:
> On Fri, Dec 10, 2021 at 02:49:59PM +0000, Catalin Marinas wrote:
> > On Fri, Dec 03, 2021 at 10:47:20AM +0000, Mark Rutland wrote:
> > > In __apply_alternatives_multi_stop() we have a "really simple polling
> > > protocol" to avoid patching code that is concurrently executed on other
> > > CPUs. Secondary CPUs wait for the boot CPU to signal that patching is
> > > complete, but the boot CPU doesn't wait for secondaries to enter the
> > > polling loop, and it's possible that patching starts while secondaries
> > > are still within the stop_machine logic.
> > >
> > > Let's fix this by adding a vaguely simple polling protocol where the
> > > boot CPU waits for secondaries to signal that they have entered the
> > > unpatchable stop function. We can use the arch_atomic_*() functions for
> > > this, as they are not patched with alternatives.
> > >
> > > At the same time, let's make `all_alternatives_applied` local to
> > > __apply_alternatives_multi_stop(), since it is only used there, and this
> > > makes the code a little clearer.
> >
> > Doesn't the stop_machine() mechanism wait for the CPUs to get in the
> > same state before calling our function or we need another stop at a
> > lower level in the arch code?
>
> The stop_machine() logic doesn't wait on the way in; it queues some work on
> each CPU sequentially to *enter* the stop function (in this case
> __apply_alternatives_multi_stop()), but there's no existing logic to ensrue
> that all CPUs have entered by some point. On the way out, stop_machine() waits
> for all CPUs to *exit* the stop function before returning.
>
> We need to synchronize on the way into __apply_alternatives_multi_stop() to be
> sure that no CPUs are executing bits which might be patched -- portions of
> stop_machine() itself, anything necessary to dequeue the work, or anything that
> might be running before the CPU spots the queued work.
I remember looking at this when I added the current polling loop for arm64
and, at the time, the loop around the state machine in multi_cpu_stop() was
straightforward enough that it wasn't a problem. Since then, however, it
appears to have grown a READ_ONCE(), so it looks like we'll need to so
something.
I'll have a look at your patches.
Will
More information about the linux-arm-kernel
mailing list