[kvm-unit-tests PATCH 1/2] lib/on-cpus: Correct and simplify synchronization
Alexandru Elisei
alexandru.elisei at arm.com
Fri Oct 25 05:19:26 PDT 2024
Hi Drew,
I've been paging in all the on_cpu* machinery, and it occurred to me that
we have a chance to simplify the code and to remove a duplicate interface
by not exposing smp_boot_secondary() to the tests, as the on_cpu* functions
serve the same purpose. With this change, we can remove the entry argument
to smp_boot_secondary(), and the assembly for secondary_entry can be made
simpler by eliminating the branch to the entry function.
Do you think that would be something worth pursuing? I can have a look at
it.
There are exactly two places where smp_boot_secondary() is used: in
arm/psci.c and arm/gic.c, and from a quick glance it looks to me like those
can be replaced with one of the on_cpu* functions.
On Wed, Oct 23, 2024 at 03:17:20PM +0200, Andrew Jones wrote:
> get/put_on_cpu_info() were trying to provide per-cpu locking for
> the per-cpu on_cpu info, but they were flawed since they would
> always set the "lock" since they were treating test_and_set/clear
> as cmpxchg (which they're not). Just revert to a normal spinlock
Would you mind expanding on that a bit more?
>From my understanding of the code, on arm64, this is the call chain that I get
for get_on_cpu_info(cpu):
->get_on_cpu_info(cpu):
->!cpumask_test_and_set(cpu, on_cpu_info_lock)
->!test_and_set_bit(cpu, on_cpu_info_lock->bits):
return (old & mask) != 0;
'mask' always has the CPU bit set, which means that get_on_cpu_info() returns
true if and only if 'old' has the bit clear. I think that prevents a thread
getting the lock if it's already held, so from that point of view it does
function as a per target cpu spinlock. Have I misunderstood something?
Regardless of the above, on_cpu_async() is used like this:
on_cpu_async()
wait_for_synchronization()
so it doesn't make much sense to optimize for performance for the case were
multiple threads call on_cpu_async() concurrently, as they would need to
have to wait for synchronization anyway.
So yes, I'm totally in favour for replacing the per-cpu spinlock with a global
spinlock, even if the only reason is simplifying the code.
> to correct it. Also simplify the break case for on_cpu_async() -
> we don't care if func is NULL, we only care that the cpu is idle.
That makes sense.
> And, finally, add a missing barrier to on_cpu_async().
Might be worth explaining in the commit message why it was missing. Just in
case someone is looking at the code and isn't exactly sure why it's there.
>
> Fixes: 018550041b38 ("arm/arm64: Remove spinlocks from on_cpu_async")
> Signed-off-by: Andrew Jones <andrew.jones at linux.dev>
> ---
> lib/on-cpus.c | 36 +++++++++++-------------------------
> 1 file changed, 11 insertions(+), 25 deletions(-)
>
> diff --git a/lib/on-cpus.c b/lib/on-cpus.c
> index 892149338419..f6072117fa1b 100644
> --- a/lib/on-cpus.c
> +++ b/lib/on-cpus.c
> @@ -9,6 +9,7 @@
> #include <on-cpus.h>
> #include <asm/barrier.h>
> #include <asm/smp.h>
> +#include <asm/spinlock.h>
>
> bool cpu0_calls_idle;
>
> @@ -18,18 +19,7 @@ struct on_cpu_info {
> cpumask_t waiters;
> };
> static struct on_cpu_info on_cpu_info[NR_CPUS];
> -static cpumask_t on_cpu_info_lock;
> -
> -static bool get_on_cpu_info(int cpu)
> -{
> - return !cpumask_test_and_set_cpu(cpu, &on_cpu_info_lock);
> -}
> -
> -static void put_on_cpu_info(int cpu)
> -{
> - int ret = cpumask_test_and_clear_cpu(cpu, &on_cpu_info_lock);
> - assert(ret);
> -}
> +static struct spinlock lock;
>
> static void __deadlock_check(int cpu, const cpumask_t *waiters, bool *found)
> {
> @@ -81,18 +71,14 @@ void do_idle(void)
> if (cpu == 0)
> cpu0_calls_idle = true;
>
> - set_cpu_idle(cpu, true);
> - smp_send_event();
> -
> for (;;) {
> + set_cpu_idle(cpu, true);
> + smp_send_event();
> +
> while (cpu_idle(cpu))
> smp_wait_for_event();
> smp_rmb();
> on_cpu_info[cpu].func(on_cpu_info[cpu].data);
> - on_cpu_info[cpu].func = NULL;
> - smp_wmb();
I think the barrier is still needed. The barrier orderered the now removed
write func = NULL before the write set_cpu_idle(), but it also orderered
whatever writes func(data) performed before set_cpu_idle(cpu, true). This
matters for on_cpu(), where I think it's reasonable for the caller to
expect to observe the writes made by 'func' after on_cpu() returns.
If you agree that this is the correct approach, I think it's worth adding a
comment explaining it.
> - set_cpu_idle(cpu, true);
> - smp_send_event();
> }
> }
>
> @@ -110,17 +96,17 @@ void on_cpu_async(int cpu, void (*func)(void *data), void *data)
>
> for (;;) {
> cpu_wait(cpu);
> - if (get_on_cpu_info(cpu)) {
> - if ((volatile void *)on_cpu_info[cpu].func == NULL)
> - break;
> - put_on_cpu_info(cpu);
> - }
> + spin_lock(&lock);
> + if (cpu_idle(cpu))
> + break;
> + spin_unlock(&lock);
> }
>
> on_cpu_info[cpu].func = func;
> on_cpu_info[cpu].data = data;
> + smp_wmb();
Without this smp_wmb(), it is possible for the target CPU to read an
outdated on_cpu_info[cpu].data. So adding it is the right thing to do,
since it orders the writes to on_cpu_info before set_cpu_idle().
> set_cpu_idle(cpu, false);
> - put_on_cpu_info(cpu);
> + spin_unlock(&lock);
> smp_send_event();
I think a DSB is necessary before all the smp_send_event() calls in this
file. The DSB ensures that the stores to cpu_idle_mask will be observed by
the thread that is waiting on the WFE, otherwise it is theoretically
possible to get a deadlock (in practice this will never happen, because KVM
will be generating the events that cause WFE to complete):
CPU0: on_cpu_async(): CPU1: do_idle():
load CPU1_idle = true
//do stuff
store CPU1_idle=false
SEV
1: WFE
load CPU1_idle=true // old value, allowed
b 1b // deadlock
Also, it looks unusual to have smp_send_event() unpaired from set_cpu_idle().
Can't really point to anything being wrong about it though.
Thanks,
Alex
More information about the kvm-riscv
mailing list