[PATCH] arm: KVM: Don't return PSCI_INVAL if waitqueue is inactive

Christoffer Dall christoffer.dall at linaro.org
Thu Dec 12 15:09:58 EST 2013


On Thu, Dec 12, 2013 at 09:23:00AM +0000, Marc Zyngier wrote:
> Hi Christoffer,
> 
> On 11/12/13 20:53, Christoffer Dall wrote:
> > On Wed, Nov 20, 2013 at 10:51:39AM -0800, Christoffer Dall wrote:
> >> The current KVM implementation of PSCI returns INVALID_PARAMETERS if the
> >> waitqueue for the corresponding CPU is not active.  This does not seem
> >> correct, since KVM should not care what the specific thread is doing,
> >> for example, user space may not have called KVM_RUN on this VCPU yet or
> >> the thread may be busy looping to user space because it received a
> >> signal; this is really up to the user space implementation.  We should
> >> simply clear the pause flag on the CPU and wake up the thread if it
> >> happens to be blocked for us.
> 
> Coming back to the PSCI spec (v0.2), it is clearly said that an error
> should be returned if the CPU is already ON (the error code is
> different, as KVM only implements v0.1 so far, but still...).
> 

ok, so we should check if the pause flag is set and report and error if
it's not, but not check the waitqueue.

> >> Further, the implementation seems to be racy when executing multiple
> >> VCPU threads.  There really isn't a reasonable user space programming
> >> scheme to ensure all secondary CPUs have reached kvm_vcpu_first_run_init
> >> before turning on the boot CPU.
> 
> Agreed.
> 
> >> It also does not make much sense to
> >> call into the PSCI code for a CPU that is turned off - after all, it
> >> cannot do anything if it is turned off and PSCI code could reasonably be
> >> written with the assumption that the VCPU is actually up, in some shape
> >> or form.
> 
> I find this to be debatable. While I agree with you that doing a CPU_OFF
> on something that has never ran is not the most gracious thing ever, it
> helps (or helped) keeping the code relatively monimal, and without too
> many actors messing with the pause flag.
> 

It's certainly debatable.  I personally found that it was hiding
information that you had to look at anyhow to understand how things
worked, and that it wasn't really beneficial, but I may have been a
little over zealous about my general statement about calling into PSCI
code.

> >> Therefore, set the pause flag on the vcpu at VCPU init time (which can
> >> reasonably be expected to be completed for all CPUs by user space before
> >> running any VCPUs) and clear both this flag and the feature (in case the
> >> feature can somehow get set again in the future) and ping the waitqueue
> >> on turning on a VCPU using PSCI.
> 
> Fair enough. See my comments below:
> 
> >>
> >> Cc: Marc Zyngier <marc.zyngier at arm.com>
> >> Reported-by: Peter Maydell <peter.maydell at linaro.org>
> >> Signed-off-by: Christoffer Dall <christoffer.dall at linaro.org>
> >> ---
> >>  arch/arm/kvm/arm.c  | 30 +++++++++++++++++++-----------
> >>  arch/arm/kvm/psci.c |  6 ++----
> >>  2 files changed, 21 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index e312e4a..1140e0e 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -478,15 +478,6 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> >>  			return ret;
> >>  	}
> >>  
> >> -	/*
> >> -	 * Handle the "start in power-off" case by calling into the
> >> -	 * PSCI code.
> >> -	 */
> >> -	if (test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features)) {
> >> -		*vcpu_reg(vcpu, 0) = KVM_PSCI_FN_CPU_OFF;
> >> -		kvm_psci_call(vcpu);
> >> -	}
> >> -
> >>  	return 0;
> >>  }
> >>  
> >> @@ -700,6 +691,24 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
> >>  	return -EINVAL;
> >>  }
> >>  
> >> +static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
> >> +					 struct kvm_vcpu_init *init)
> >> +{
> >> +	int ret;
> >> +
> >> +	ret = kvm_vcpu_set_target(vcpu, init);
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	/*
> >> +	 * Handle the "start in power-off" case by marking the VCPU as paused.
> >> +	 */
> >> +	if (test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
> >> +		vcpu->arch.pause = true;
> 
> Do we really need a test_and_clear? I know the original code used it,
> but I now fail to see a reason why.
> 

You mean why the atomic version?  I think it reads nicely, we need to
test a bit and we need to clear it, but we can use __test_and_clear_bit
instead if you prefer.

> >> +	return 0;
> >> +}
> >> +
> >>  long kvm_arch_vcpu_ioctl(struct file *filp,
> >>  			 unsigned int ioctl, unsigned long arg)
> >>  {
> >> @@ -713,8 +722,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
> >>  		if (copy_from_user(&init, argp, sizeof(init)))
> >>  			return -EFAULT;
> >>  
> >> -		return kvm_vcpu_set_target(vcpu, &init);
> >> -
> >> +		return kvm_arch_vcpu_ioctl_vcpu_init(vcpu, &init);
> >>  	}
> >>  	case KVM_SET_ONE_REG:
> >>  	case KVM_GET_ONE_REG: {
> >> diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
> >> index 0881bf1..2e72ef5 100644
> >> --- a/arch/arm/kvm/psci.c
> >> +++ b/arch/arm/kvm/psci.c
> >> @@ -59,10 +59,6 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
> >>  
> >>  	target_pc = *vcpu_reg(source_vcpu, 2);
> >>  
> >> -	wq = kvm_arch_vcpu_wq(vcpu);
> >> -	if (!waitqueue_active(wq))
> >> -		return KVM_PSCI_RET_INVAL;
> >> -
> 
> That I object to. Calling CPU_ON on a CPU that is already ON is an
> error, and should be reported as such.
> 

Good point, but that's not why I removed the check - see above.  The
check should be against the pause flag, not the state of the waitqueue.
I'll adjust the patch.

> >>  	kvm_reset_vcpu(vcpu);
> >>  
> >>  	/* Gracefully handle Thumb2 entry point */
> >> @@ -76,9 +72,11 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
> >>  		kvm_vcpu_set_be(vcpu);
> >>  
> >>  	*vcpu_pc(vcpu) = target_pc;
> >> +	clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features);
> 
> Why do you clear that bit here? Given the above test_and_clear, it
> shouldn't be set anymore.
> 

No reason, it's unnecessary.  I'll remove it.

> >>  	vcpu->arch.pause = false;
> >>  	smp_mb();		/* Make sure the above is visible */
> >>  
> >> +	wq = kvm_arch_vcpu_wq(vcpu);
> >>  	wake_up_interruptible(wq);
> >>  
> >>  	return KVM_PSCI_RET_SUCCESS;
> >> -- 
> >> 1.8.4.3
> >>
> > 
> 
> Otherwise, I think this is a valuable cleanup.
> 

Thanks!
-- 
Christoffer



More information about the linux-arm-kernel mailing list