[PATCH v2 5/6] arm/arm64: KVM: Turn off vcpus on PSCI shutdown/reboot
Marc Zyngier
marc.zyngier at arm.com
Mon Dec 8 05:19:15 PST 2014
On 08/12/14 12:58, Christoffer Dall wrote:
> On Mon, Dec 08, 2014 at 12:04:53PM +0000, Marc Zyngier wrote:
>> On 03/12/14 21:18, Christoffer Dall wrote:
>>> When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus
>>> of the VM should really be turned off. This follows the suggestions in
>>> the PSCI spec, and it's the sane thing to do.
>>>
>>> Also, clarify the behavior and expectations for exits to user space with
>>> the KVM_EXIT_SYSTEM_EVENT case.
>>>
>>> Signed-off-by: Christoffer Dall <christoffer.dall at linaro.org>
>>> ---
>>> Documentation/virtual/kvm/api.txt | 9 +++++++++
>>> arch/arm/kvm/psci.c | 19 +++++++++++++++++++
>>> arch/arm64/include/asm/kvm_host.h | 1 +
>>> 3 files changed, 29 insertions(+)
>>>
>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>>> index 81f1b97..228f9cf 100644
>>> --- a/Documentation/virtual/kvm/api.txt
>>> +++ b/Documentation/virtual/kvm/api.txt
>>> @@ -2957,6 +2957,15 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
>>> the system-level event type. The 'flags' field describes architecture
>>> specific flags for the system-level event.
>>>
>>> +Valid values for 'type' are:
>>> +  KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
>>> +   VM. Userspace is not obliged to honour this, and if it does honour
>>> +   this does not need to destroy the VM synchronously (i.e. it may call
>>> +   KVM_RUN again before shutdown finally occurs).
>>> +  KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
>>> +   As with SHUTDOWN, userspace can choose to ignore the request, or
>>> +   to schedule the reset to occur in the future and may call KVM_RUN again.
>>> +
>>> /* Fix the size of the union. */
>>> char padding[256];
>>> };
>>> diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
>>> index 09cf377..ae0bb91 100644
>>> --- a/arch/arm/kvm/psci.c
>>> +++ b/arch/arm/kvm/psci.c
>>> @@ -15,6 +15,7 @@
>>> * along with this program. If not, see <http://www.gnu.org/licenses/>.
>>> */
>>>
>>> +#include <linux/preempt.h>
>>> #include <linux/kvm_host.h>
>>> #include <linux/wait.h>
>>>
>>> @@ -166,6 +167,24 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
>>>
>>> static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
>>> {
>>> +	int i;
>>> +	struct kvm_vcpu *tmp;
>>> +
>>> +	/*
>>> +	 * The KVM ABI specifies that a system event exit may call KVM_RUN
>>> +	 * again and may perform shutdown/reboot at a later time than when the
>>> +	 * actual request is made. Since we are implementing PSCI and a
>>> +	 * caller of PSCI reboot and shutdown expects that the system shuts
>>> +	 * down or reboots immediately, let's make sure that VCPUs are not run
>>> +	 * after this call is handled and before the VCPUs have been
>>> +	 * re-initialized.
>>> +	 */
>>> +	kvm_for_each_vcpu(i, tmp, vcpu->kvm)
>>> +		tmp->arch.pause = true;
>>> +	preempt_disable();
>>> +	force_vm_exit(cpu_all_mask);
>>> +	preempt_enable();
>>> +
>>
>> I'm slightly uneasy about this force_vm_exit, as this is something that
>> is directly triggered by the guest. I suppose it is almost impossible to
>> find out which CPUs we're actually using...
>>
> Ah, you mean we should only IPI the CPUs that are actually running a
> VCPU belonging to this VM?
>
> I guess I could replace it with:
>
> 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> 		tmp->arch.pause = true;
> 		kvm_vcpu_kick(tmp);
> 	}

Ah, that's even simpler than I thought. Yeah, looks good to me.

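To make the difference between the two approaches concrete, here is a minimal userspace sketch (not kernel code) that counts how many IPIs each one would send. Everything in it is an assumption made for illustration: the `model_*` names and `MODEL_NR_CPUS` are hypothetical, all modelled CPUs are taken to be online, the broadcast is assumed to skip the calling CPU, and the wake-up of vcpu threads sleeping on waitqueues is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: 8 physical CPUs, all assumed online. */
#define MODEL_NR_CPUS 8

struct model_vcpu {
	int cpu;	/* physical CPU running this vcpu, -1 if not loaded */
	bool pause;	/* stands in for vcpu->arch.pause */
};

/*
 * Broadcast approach: pause every vcpu, then interrupt every CPU
 * except the caller, whether or not it runs a vcpu of this VM.
 * Returns the number of IPIs the model would send.
 */
static int model_broadcast_exit(struct model_vcpu *vcpus, size_t n, int me)
{
	assert(me >= 0 && me < MODEL_NR_CPUS);
	for (size_t i = 0; i < n; i++)
		vcpus[i].pause = true;
	return MODEL_NR_CPUS - 1;
}

/*
 * Targeted approach: pause every vcpu, and interrupt only the CPUs
 * that currently run one of this VM's vcpus (skipping the caller).
 * Returns the number of IPIs the model would send.
 */
static int model_targeted_kick(struct model_vcpu *vcpus, size_t n, int me)
{
	int ipis = 0;

	assert(me >= 0 && me < MODEL_NR_CPUS);
	for (size_t i = 0; i < n; i++) {
		vcpus[i].pause = true;
		if (vcpus[i].cpu >= 0 && vcpus[i].cpu < MODEL_NR_CPUS &&
		    vcpus[i].cpu != me)
			ipis++;
	}
	return ipis;
}
```

In this model, a VM with four vcpus of which two run on remote CPUs costs two IPIs with the targeted loop, against seven with the broadcast, and the gap grows with the size of the host.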
>
> or a slightly more optimized "half-open-coded-kvm_vcpu_kick":
>
> 	me = get_cpu();
> 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> 		tmp->arch.pause = true;
> 		if (tmp->cpu != me && (unsigned)tmp->cpu < nr_cpu_ids &&
> 		    cpu_online(tmp->cpu) && kvm_arch_vcpu_should_kick(tmp))
> 			smp_send_reschedule(tmp->cpu);
> 	}
> 	put_cpu();	/* pairs with get_cpu() above */
>
> which should save us waking up vcpu threads that are parked on
> waitqueues. Not sure it's worth it, maybe it is for 100s of vcpu
> systems?

Probably not worth it at the moment.

> Can we actually replace force_vm_exit() with the more optimized
> open-coded version? That messes with VMID allocation so it really needs
> a lot of testing though...

VMID reallocation almost never occurs, and that's a system-wide event,
not triggered by a guest. I'd rather not mess with that just yet.

> Preferences?

I think your first version is very nice, provided that it doesn't
introduce any unforeseen regression.

Thanks,
M.
--
Jazz is not dead. It just smells funny...
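
The api.txt hunk quoted in this thread says userspace may ignore a system event, or act on it later and call KVM_RUN again in the meantime. A minimal sketch of that deferred handling follows, as a standalone model rather than real VMM code. The `model_*` names and the numeric event values are placeholders chosen for the sketch; real code would read the event type from the kvm_run structure and use the constants from <linux/kvm.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder event codes for the sketch; real code uses <linux/kvm.h>. */
enum {
	MODEL_EVENT_SHUTDOWN = 1,
	MODEL_EVENT_RESET = 2,
};

enum model_action {
	MODEL_NONE,	/* nothing pending, keep running the guest */
	MODEL_STOP_VM,	/* tear the VM down when convenient */
	MODEL_RESET_VM,	/* re-initialize the vcpus when convenient */
};

/* Hypothetical VMM state: only records what the guest asked for. */
struct model_vmm {
	enum model_action pending;
};

/*
 * Record the guest's request and return immediately. The VMM's main
 * loop is free to act on 'pending' later, which is the asynchronous
 * behaviour the documentation allows. Unknown types are ignored.
 */
static void model_handle_system_event(struct model_vmm *vmm, uint32_t type)
{
	assert(vmm);
	switch (type) {
	case MODEL_EVENT_SHUTDOWN:
		vmm->pending = MODEL_STOP_VM;
		break;
	case MODEL_EVENT_RESET:
		vmm->pending = MODEL_RESET_VM;
		break;
	default:
		break;
	}
}
```

Because userspace may defer like this, the kernel side cannot rely on it to stop the vcpus promptly, which is the reason the patch pauses them itself.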