[PATCH v7 07/16] arm64: kvm: allows kvm cpu hotplug

James Morse james.morse at arm.com
Tue Apr 19 10:37:13 PDT 2016


Hi Marc, Takahiro,

On 19/04/16 17:03, Marc Zyngier wrote:
> On 01/04/16 17:53, James Morse wrote:
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index b5384311dec4..962904a443be 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -591,7 +587,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>  		/*
>>  		 * Re-check atomic conditions
>>  		 */
>> -		if (signal_pending(current)) {
>> +		if (unlikely(!__this_cpu_read(kvm_arm_hardware_enabled))) {
>> +			/* cpu has been torn down */
>> +			ret = 0;
>> +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
>> +			run->fail_entry.hardware_entry_failure_reason
>> +					= (u64)-ENOEXEC;
> 
> This hunk makes me feel a bit uneasy. Having to check something that
> critical on the entry path is at least a bit weird. If we've reset EL2
> already, it means that we must have forced an exit on the guest to do so.

(To save anyone else digging: the story comes from v12 of the kexec copy of this
patch [0])


> So why do we hand the control back to KVM (or anything else) once we've
> nuked a CPU? I'd expect it to be put on some back-burner, never to
> return in this lifetime...

This looks like the normal reboot code being called in the middle of a running
system. Kexec calls kernel_restart_prepare(), which kicks each reboot notifier,
one of which is kvm_reboot(), which calls:
> on_each_cpu(hardware_disable_nolock, NULL, 1);

We have to give the CPU back as there may be other reboot notifiers, and kexec
hasn't yet shuffled onto the boot cpu.
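
To make the ordering concrete, here is a rough user-space model of that sequence. It is only a sketch: every model_* name is made up for illustration, the per-cpu flag is a plain array, and on_each_cpu() is reduced to a loop. The point it shows is that the notifier chain keeps running after kvm's notifier returns, so the CPUs stay in use with EL2 already torn down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for the per-cpu kvm_arm_hardware_enabled flag. */
static bool model_hw_enabled[MODEL_NR_CPUS];
/* Hypothetical stand-in for kvm_rebooting. */
static bool model_kvm_rebooting;
/* Counts calls to the unrelated notifier that follows kvm's in the chain. */
static int model_other_notifier_calls;

typedef void (*model_notifier_fn)(void);

/* Models hardware_disable_nolock() running on one CPU. */
static void model_hardware_disable(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_hw_enabled[cpu] = false;
}

/* Models kvm_reboot(): set the flag, then tear down every CPU. */
static void model_kvm_reboot(void)
{
	model_kvm_rebooting = true;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_hardware_disable(cpu);
}

/* Some other subsystem's reboot notifier, which still needs a working CPU. */
static void model_other_notifier(void)
{
	model_other_notifier_calls++;
}

/* Models kernel_restart_prepare() walking the reboot notifier chain. */
static void model_restart_prepare(void)
{
	static const model_notifier_fn chain[] = {
		model_kvm_reboot,
		model_other_notifier,
	};

	for (size_t i = 0; i < sizeof(chain) / sizeof(chain[0]); i++)
		chain[i]();
}

/* Puts the model back into its "system running, KVM enabled" state. */
static void model_boot(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_hw_enabled[cpu] = true;
	model_kvm_rebooting = false;
	model_other_notifier_calls = 0;
}

/* Would a vcpu thread scheduled on 'cpu' still be safe to enter the guest? */
static bool model_guest_entry_is_safe(int cpu)
{
	return model_hw_enabled[cpu];
}
```

After model_boot() followed by model_restart_prepare(), model_guest_entry_is_safe() is false for every CPU, yet model_other_notifier() has still run, which is the window a vcpu thread can fall into.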


How about moving this check into handle_exit()[1]?
Currently handle_exit() sees ARM_EXCEPTION_IRQ and tries to re-enter the guest.
We could test for kvm_rebooting there, which is set by kvm's reboot notifier.
This would still break if another vcpu wakes from cond_resched() and sprints
towards __kvm_vcpu_run(), though. Can we move cond_resched() to immediately
before handle_exit()?
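
As a toy illustration of the single-vcpu case only, the run loop can be modelled in user space like this. All model_* names and the reboot_after knob are invented for the sketch, every guest exit is assumed to be an IRQ, and the race with a second vcpu waking from cond_resched() is not modelled.

```c
#include <stdbool.h>

enum model_exit { MODEL_EXIT_IRQ, MODEL_EXIT_TO_USER };

struct model_vcpu {
	bool rebooting;		/* stands in for kvm_rebooting */
	int guest_entries;	/* how many times the guest was entered */
	int reboot_after;	/* entry count at which the notifier fires, 0 = never */
};

/* Models handle_exit(): > 0 means re-enter the guest, 0 means back to user space. */
static int model_handle_exit(const struct model_vcpu *v, enum model_exit why)
{
	if (v->rebooting)
		return 0;	/* the proposed check: give up once EL2 is gone */

	return why == MODEL_EXIT_IRQ ? 1 : 0;
}

/* Models the while (ret > 0) loop in kvm_arch_vcpu_ioctl_run(). */
static int model_vcpu_run(struct model_vcpu *v, int max_entries)
{
	int ret = 1;

	while (ret > 0 && v->guest_entries < max_entries) {
		v->guest_entries++;	/* stands in for __kvm_vcpu_run() */

		/* The reboot notifier's IPI is what forced this guest exit. */
		if (v->guest_entries == v->reboot_after)
			v->rebooting = true;

		/* In the proposal, cond_resched() would sit here. */
		ret = model_handle_exit(v, MODEL_EXIT_IRQ);
	}

	return v->guest_entries;
}
```

With reboot_after set to 3 the loop stops after the third entry; with it left at 0 the IRQ exits keep re-entering the guest until max_entries is reached.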

I can't see a reason why this doesn't happen on the normal reboot path;
presumably we rely on user space to kill any running guests.



It looks like x86 uses the extable to work around this. Their vmx_vcpu_run() has:
> 		__ex(ASM_VMX_VMLAUNCH) "\n\t"
where __ex ends up expanding to ____kvm_handle_fault_on_reboot(), with a nearby comment:
> * Hardware virtualization extension instructions may fault if a
> * reboot turns off virtualization while processes are running.
> * Trap the fault and ignore the instruction if that happens.
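
As I read that macro, the fixup path checks kvm_rebooting and either resumes after the faulting instruction or treats the fault as a bug. A minimal user-space model of that decision, with made-up model_* names and no real exception table involved, might look like:

```c
#include <stdbool.h>

enum model_outcome {
	MODEL_OK,		/* instruction executed normally */
	MODEL_FAULT_IGNORED,	/* faulted during reboot: skip the instruction */
	MODEL_FAULT_FATAL,	/* faulted outside reboot: a genuine bug */
};

/*
 * Models one virtualization instruction wrapped in the fault-on-reboot
 * fixup. 'virt_enabled' says whether the hardware extensions are still
 * switched on for this CPU; 'rebooting' stands in for kvm_rebooting.
 */
static enum model_outcome model_exec_virt_insn(bool virt_enabled, bool rebooting)
{
	if (virt_enabled)
		return MODEL_OK;

	/* The instruction faulted; the fixup decides what that means. */
	return rebooting ? MODEL_FAULT_IGNORED : MODEL_FAULT_FATAL;
}
```

The difference from the check in the patch is where the cost lands: nothing extra runs on the entry path unless the instruction actually faults.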


Takahiro, any ideas/wisdom on this?


Thanks,

James

[0] http://lists.infradead.org/pipermail/kexec/2015-December/014953.html
[1] Untested(!) alternative.
====================================================
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 0e63047a9530..dfa3cc42ec89 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -562,11 +562,6 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
        ret = 1;
        run->exit_reason = KVM_EXIT_UNKNOWN;
        while (ret > 0) {
-               /*
-                * Check conditions before entering the guest
-                */
-               cond_resched();
-
                update_vttbr(vcpu->kvm);

                if (vcpu->arch.power_off || vcpu->arch.pause)
@@ -662,6 +657,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)

                preempt_enable();

+               cond_resched();
+
                ret = handle_exit(vcpu, run, ret);
        }

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index eba89e42f0ed..cc562d9ff905 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -170,6 +170,12 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 {
        exit_handle_fn exit_handler;

+       if (kvm_rebooting) {
+               run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+               run->fail_entry.hardware_entry_failure_reason = (u64)-ENOEXEC;
+               return 0;
+       }
+
        switch (exception_index) {
        case ARM_EXCEPTION_IRQ:
                return 1;
====================================================




