[PATCH v2] KVM: arm64: Initialize VCPU mdcr_el2 before loading it

Wed Mar 31 11:48:12 BST 2021

Hi Marc,

On 3/30/21 8:57 PM, Marc Zyngier wrote:
> On Tue, 30 Mar 2021 18:49:54 +0100,
> Alexandru Elisei <alexandru.elisei at arm.com> wrote:
>> Hi Marc,
>>
>> On 3/30/21 6:13 PM, Alexandru Elisei wrote:
>>> [..]
>>>>> +}
>>>>> +
>>>>>  /**
>>>>>   * kvm_arm_reset_debug_ptr - reset the debug ptr to point to the vcpu state
>>>>>   */
>>>>> @@ -83,12 +137,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
>>>>>   * @vcpu:	the vcpu pointer
>>>>>   *
>>>>>   * This is called before each entry into the hypervisor to setup any
>>>>> - * debug related registers. Currently this just ensures we will trap
>>>>> - * access to:
>>>>> - *  - Performance monitors (MDCR_EL2_TPM/MDCR_EL2_TPMCR)
>>>>> - *  - Debug ROM Address (MDCR_EL2_TDRA)
>>>>> - *  - OS related registers (MDCR_EL2_TDOSA)
>>>>> - *  - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
>>>>> + * debug related registers.
>>>>>   *
>>>>>   * Additionally, KVM only traps guest accesses to the debug registers if
>>>>>   * the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
>>>>> @@ -100,27 +149,14 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
>>>>>  
>>>>>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
>>>>>  {
>>>>> -	bool trap_debug = !(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY);
>>>>>  	unsigned long mdscr, orig_mdcr_el2 = vcpu->arch.mdcr_el2;
>>>>>  
>>>>>  	trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug);
>>>>>  
>>>>> -	/*
>>>>> -	 * This also clears MDCR_EL2_E2PB_MASK to disable guest access
>>>>> -	 * to the profiling buffer.
>>>>> -	 */
>>>>> -	vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK;
>>>>> -	vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
>>>>> -				MDCR_EL2_TPMS |
>>>>> -				MDCR_EL2_TPMCR |
>>>>> -				MDCR_EL2_TDRA |
>>>>> -				MDCR_EL2_TDOSA);
>>>>> +	kvm_arm_setup_mdcr_el2(vcpu, __this_cpu_read(mdcr_el2));
>>>>>  
>>>>>  	/* Is Guest debugging in effect? */
>>>>>  	if (vcpu->guest_debug) {
>>>>> -		/* Route all software debug exceptions to EL2 */
>>>>> -		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDE;
>>>>> -
>>>>>  		/* Save guest debug state */
>>>>>  		save_guest_debug_regs(vcpu);
>>>>>  
>>>>> @@ -174,7 +210,6 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
>>>>>  
>>>>>  			vcpu->arch.debug_ptr = &vcpu->arch.external_debug_state;
>>>>>  			vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
>>>>> -			trap_debug = true;
>>>> There is something that slightly worries me here: there is now a
>>>> disconnect between flagging debug as dirty and setting the
>>>> trapping. And actually, you now check for KVM_ARM64_DEBUG_DIRTY and
>>>> set the trap bits *before* setting the dirty bit itself.
>>>>
>>>> Here, I believe you end up with guest/host confusion of breakpoints,
>>>> which isn't great. Or did I miss something?
>>> I'm sorry, but I don't understand what you mean. This is my understanding of what
>>> is happening.
>>>
>>> Without this patch, trap_debug is set to true and the KVM_ARM64_DEBUG_DIRTY flag
>>> is set if vcpu->guest_debug & KVM_GUESTDBG_USE_HW. Further down, trap debug is
>>> only used when computing mdcr_el2.
>>>
>>> With this patch, trap_debug is set to true if vcpu->guest_debug &
>>> KVM_GUESTDBG_USE_HW and it's also used for computing mdcr_el2, but this happens in
>>> kvm_arm_setup_mdcr_el2(), which is called at the start of kvm_arm_setup_debug().
>>> The KVM_ARM_DEBUG_DIRTY flags is still set in kvm_arm_setup_debug() if
>>> vcpu->guest_debug & KVM_GUESTDBG_USE_HW, like before.
>>>
>>> The guest never runs with the value computed in kvm_vcpu_first_run_init() unless
>>> it's identical with the value recomputed in kvm_arm_setup_debug().
>>>
>>> The only difference I see is that mdcr_el2 is computed at the start of
>>> kvm_arm_setup_debug(). I get the feeling I'm also missing something.
>> I think I understand what you mean, you are worried that we won't
>> set the bit in mdcr_el2 to trap debug in the same place where we set
>> the debug dirty flag.
> Yes, that's what I mean. The code is conceptually as such ATM:
>
> 	debug_trap = (something based on vcpu->flags);
> 	if (something else) {
> 		check stuff;
> 		vcpu->flags |= stuff;
> 		debug_trap = true;
> 	}
>
> 	if (debug_trap)
> 		set trap conditions;
>
> You are turning this into:
>
> 	debug_trap = (something based on vcpu->flags);
> 	if (debug_trap) {
> 		set trap conditions;
> 	}
> 	if (something else) {
> 		check stuff;
> 		vcpu->flags |= stuff;
> 	}
>
> which isn't the same thing. In your case, it probably works because of
> KVM_GUESTDBG_USE_HW, but that's really hard to follow, and we have had
> so many bugs in the debug code that it really needs to be kept as
> stupid as possible.
>
>> If that's the case, then I can move kvm_arm_setup_mdcr_el2 right
>> after the BUG_ON() and remove the KVM_GUESTDBG_USE_HW check because
>> the KVM_ARM_DEBUG_DIRTY would be already set.
> Yes, I think that'd be better.

Ok, I'll do that then.

>
>> Question though, if mdcr_el2 is tied to the debug dirty flag, we
>> ignore the flag here (code without this patch):
>>
>>     BUG_ON(!vcpu->guest_debug &&
>>         vcpu->arch.debug_ptr != &vcpu->arch.vcpu_debug_state);
>>
>>     /* Trap debug register access */
>>     if (trap_debug)
>>         vcpu->arch.mdcr_el2 |= MDCR_EL2_TDA;
>>
>>     /* If KDE or MDE are set, perform a full save/restore cycle. */
>>     if (vcpu_read_sys_reg(vcpu, MDSCR_EL1) & (DBG_MDSCR_KDE | DBG_MDSCR_MDE))
>>         vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
>>
>> I suppose there's something I don't understand yet about how this is
>> supposed to work.
> The idea (IIRC) is that if MDSCR_EL1.KDE or MDSCR_EL1.MDE are set,
> that's because the guest is currently debugging, and that we are
> better off saying that the debug state is dirty, forcing a
> save/restore cycle on entry.

That made things a bit clearer, thanks.

Thanks,

Alex

>
> You may want to dig into the git history for more accurate
> information...
>
> Hope this helps,
>
> 	M.
>