[PATCH v2 1/5] KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Mon Aug 23 08:00:03 PDT 2021


----- On Aug 20, 2021, at 6:49 PM, Sean Christopherson seanjc at google.com wrote:

> Invoke rseq's NOTIFY_RESUME handler when processing the flag prior to
> transferring to a KVM guest, which is roughly equivalent to an exit to
> userspace and processes many of the same pending actions.  While the task
> cannot be in an rseq critical section as the KVM path is reachable only
> via ioctl(KVM_RUN), the side effects that apply to rseq outside of a
> critical section still apply, e.g. the current CPU needs to be updated if
> the task is migrated.
> 
> Clearing TIF_NOTIFY_RESUME without informing rseq can lead to segfaults
> and other badness in userspace VMMs that use rseq in combination with KVM,
> e.g. due to the CPU ID being stale after task migration.

Acked-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>

> 
> Fixes: 72c3c0fe54a3 ("x86/kvm: Use generic xfer to guest work function")
> Reported-by: Peter Foley <pefoley at google.com>
> Bisected-by: Doug Evans <dje at google.com>
> Cc: Shakeel Butt <shakeelb at google.com>
> Cc: Thomas Gleixner <tglx at linutronix.de>
> Cc: stable at vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc at google.com>
> ---
> kernel/entry/kvm.c |  4 +++-
> kernel/rseq.c      | 14 +++++++++++---
> 2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
> index 49972ee99aff..049fd06b4c3d 100644
> --- a/kernel/entry/kvm.c
> +++ b/kernel/entry/kvm.c
> @@ -19,8 +19,10 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
> 		if (ti_work & _TIF_NEED_RESCHED)
> 			schedule();
> 
> -		if (ti_work & _TIF_NOTIFY_RESUME)
> +		if (ti_work & _TIF_NOTIFY_RESUME) {
> 			tracehook_notify_resume(NULL);
> +			rseq_handle_notify_resume(NULL, NULL);
> +		}
> 
> 		ret = arch_xfer_to_guest_mode_handle_work(vcpu, ti_work);
> 		if (ret)
> diff --git a/kernel/rseq.c b/kernel/rseq.c
> index 35f7bd0fced0..6d45ac3dae7f 100644
> --- a/kernel/rseq.c
> +++ b/kernel/rseq.c
> @@ -282,9 +282,17 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
> 
> 	if (unlikely(t->flags & PF_EXITING))
> 		return;
> -	ret = rseq_ip_fixup(regs);
> -	if (unlikely(ret < 0))
> -		goto error;
> +
> +	/*
> +	 * regs is NULL if and only if the caller is in a syscall path.  Skip
> +	 * fixup and leave rseq_cs as is so that rseq_syscall() will detect and
> +	 * kill a misbehaving userspace on debug kernels.
> +	 */
> +	if (regs) {
> +		ret = rseq_ip_fixup(regs);
> +		if (unlikely(ret < 0))
> +			goto error;
> +	}
> 	if (unlikely(rseq_update_cpu_id(t)))
> 		goto error;
> 	return;
> --
> 2.33.0.rc2.250.ged5fa647cd-goog
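
For readers following the control flow, the behaviour of the patched
__rseq_handle_notify_resume() can be sketched as a small userspace model.
This is only an illustration under stated assumptions: struct model_task,
struct model_regs, model_ip_fixup() and model_handle_notify_resume() are
hypothetical names invented here, not kernel types or functions, and the
PF_EXITING check and error paths are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the kernel's real types. */
struct model_regs {
	unsigned long ip;		/* user instruction pointer */
};

struct model_task {
	int cpu;			/* CPU the task currently runs on */
	int rseq_cpu_id;		/* value userspace reads from its rseq area */
	bool in_rseq_cs;		/* an rseq critical section is registered */
	unsigned long abort_ip;		/* abort handler of that critical section */
};

/* Loosely mirrors rseq_ip_fixup(): redirect the task to its abort handler. */
static int model_ip_fixup(struct model_task *t, struct model_regs *regs)
{
	if (t->in_rseq_cs) {
		regs->ip = t->abort_ip;
		t->in_rseq_cs = false;
	}
	return 0;
}

/*
 * Loosely mirrors the patched handler: the IP fixup runs only when regs is
 * non-NULL, while the CPU ID is refreshed in both cases.
 */
static void model_handle_notify_resume(struct model_task *t,
				       struct model_regs *regs)
{
	if (regs)
		model_ip_fixup(t, regs);
	t->rseq_cpu_id = t->cpu;
}
```

In this model, calling model_handle_notify_resume(&t, NULL) after changing
t.cpu refreshes t.rseq_cpu_id but leaves in_rseq_cs untouched, which
corresponds to the KVM path in the patch: the stale CPU ID is corrected and
any critical-section state is left for a later check.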

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


