[RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems

Mon Oct 12 11:32:40 EDT 2020

Hi Marc, Qais,

On 09/10/2020 13:48, Qais Yousef wrote:
> On 10/09/20 13:34, Marc Zyngier wrote:
>> On 2020-10-09 10:58, Qais Yousef wrote:

>>>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>>>> index b588c3b5c2f0..22ff3373d855 100644
>>>>> --- a/arch/arm64/kvm/arm.c
>>>>> +++ b/arch/arm64/kvm/arm.c
>>>>> @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>>>>>  	struct kvm_run *run = vcpu->run;
>>>>>  	int ret;
>>>>>
>>>>> +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
>>>>> +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
>>>>
>>>> No, we don't scream on the console in an uncontrolled way based on
>>>> illegal user input (yes, the VM *is* userspace).
>>>
>>> It seemed kind to print a good reason of what just happened.

>>>> Furthermore, you seem to deal with the same problem *twice*. See
>>>> below.
>>>
>>> It's done below because we could loop back into the guest again, so we
>>> force an
>>> exit then. Here to make sure if the VMM ignores the error value we
>>> returned
>>> earlier it can't force its way back in again.

>> Which we already handle if you do what I hinted at below.

Do we trust the VMM not to try and get out of this?

We sanity-check the SPSR values the VMM writes via set_one_reg() to prevent aarch32 on
systems that don't support it. It seems strange that if you can get the bad value out of
hardware: you can keep it.
Returning to aarch32 from EL2 on a CPU that doesn't support it is terrifying.

To avoid always testing on entry from user-space, we could add a
'vmm-fixed-bad-value-from-hardware' request type, and refactor check_vcpu_requests() to
allow it to force a guest exit. Any KVM_EXIT_FAIL_ENTRY can set this to ensure the VMM
doesn't have its fingers in its ears.

This means the VMM can fix this by set_one_reg()ing an exception to aarch64 if it really
wants to, but it can't restart the guest with the bad SPSR value.

>>>>> +		return -ENOEXEC;
>>>>> +	}
>>>>> +
>>>>>  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
>>>>>  		return -ENOEXEC;
>>>>>
>>>>> @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>>>>>
>>>>>  		preempt_enable();
>>>>>
>>>>> +		/*
>>>>> +		 * For asym aarch32 systems we present a 64bit only system to
>>>>> +		 * the guest. But in case it managed somehow to escape that and
>>>>> +		 * enter 32bit mode, catch that and prevent it from running
>>>>> +		 * again.
>>>>
>>>> The guest didn't *escape* anything. It merely used the CPU as
>>>> designed.
>>>> The fact that the hypervisor cannot prevent the guest from using
>>>> AArch32
>>>> is an architectural defect.

>>> Because I probably didn't navigate my way correctly around the code.
>>> Mind
>>> expanding how to mark the vcpu as uninitialized? I have tried 2 ways
>>> in that effect but they were really horrible, so will abstain from
>>> sharing :-)
>>
>> You can try setting vcpu->arch.target to -1, which is already caught by
>> kvm_vcpu_initialized() right at the top of this function. This will

>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.

This doesn't reset SPSR, so this lets the VMM restart the guest with a bad value.

I think we should make it impossible to return to aarch32 from EL2 on these systems.

Thanks,

James

> That's actually one of the things I tried before ending with this patch.
> I thought it's really aggressive and unfriendly.
> 
> It does indeed cause qemu to abort out completely.
> 
>> +                       /* Reset target so it won't run again */
>> +                       vcpu->arch.target = -1;
>>