[PATCH] KVM: arm64: Make vcpu flag updates non-preemptible

Will Deacon will at kernel.org
Mon Apr 17 04:40:26 PDT 2023


On Mon, Apr 17, 2023 at 10:36:29AM +0100, Marc Zyngier wrote:
> Per-vcpu flags are updated using a non-atomic RMW operation,
> which means it is possible to get preempted between the read
> and write operations.
> 
> Another interesting thing to note is that preemption also updates
> flags, as we have some flag manipulation in both the load and put
> operations.
> 
> It is thus possible to lose information communicated by either
> load or put, as the preempted flag update will overwrite the flags
> when the thread is resumed. This is especially critical if either
> load or put has stored information which depends on the physical
> CPU the vcpu runs on.
> 
> This results in really elusive bugs, and kudos must be given to
> Mostafa for the long hours of debugging, and finally spotting
> the problem.
> 
> Fixes: e87abb73e594 ("KVM: arm64: Add helpers to manipulate vcpu flags among a set")
> Reported-by: Mostafa Saleh <smostafa at google.com>
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> Cc: stable at vger.kernel.org
> ---
>  arch/arm64/include/asm/kvm_host.h | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index bcd774d74f34..d716cfd806e8 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -579,6 +579,19 @@ struct kvm_vcpu_arch {
>  		v->arch.flagset & (m);				\
>  	})
>  
> +/*
> + * Note that the set/clear accessors must be preempt-safe in order to
> + * avoid nesting them with load/put which also manipulate flags...
> + */
> +#ifdef __KVM_NVHE_HYPERVISOR__
> +/* the nVHE hypervisor is always non-preemptible */
> +#define __vcpu_flags_preempt_disable()
> +#define __vcpu_flags_preempt_enable()
> +#else
> +#define __vcpu_flags_preempt_disable()	preempt_disable()
> +#define __vcpu_flags_preempt_enable()	preempt_enable()
> +#endif

If it makes things cleaner, we could define local (empty) copies of these
preempt_*() macros at EL2 to save you having to wrap them here. Up to you.

>  #define __vcpu_set_flag(v, flagset, f, m)			\
>  	do {							\
>  		typeof(v->arch.flagset) *fset;			\
> @@ -586,9 +599,11 @@ struct kvm_vcpu_arch {
>  		__build_check_flag(v, flagset, f, m);		\
>  								\
>  		fset = &v->arch.flagset;			\
> +		__vcpu_flags_preempt_disable();			\
>  		if (HWEIGHT(m) > 1)				\
>  			*fset &= ~(m);				\
>  		*fset |= (f);					\
> +		__vcpu_flags_preempt_enable();			\
>  	} while (0)
>  
>  #define __vcpu_clear_flag(v, flagset, f, m)			\
> @@ -598,7 +613,9 @@ struct kvm_vcpu_arch {
>  		__build_check_flag(v, flagset, f, m);		\
>  								\
>  		fset = &v->arch.flagset;			\
> +		__vcpu_flags_preempt_disable();			\
>  		*fset &= ~(m);					\
> +		__vcpu_flags_preempt_enable();			\
>  	} while (0)
>  
>  #define vcpu_get_flag(v, ...)	__vcpu_get_flag((v), __VA_ARGS__)

Given that __vcpu_get_flag() is still preemptible, we should probably
add a READ_ONCE() in there when we access the relevant flags field. In
practice, they're all single-byte fields so it should be ok, but I think
the READ_ONCE() is still worthwhile.

Will
