[PATCH 4/4] KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata

Marc Zyngier maz at kernel.org
Tue Jan 25 10:36:47 PST 2022


On Tue, 25 Jan 2022 18:19:45 +0000,
James Morse <james.morse at arm.com> wrote:
> 
> Hi Marc,
> 
> On 25/01/2022 16:51, Marc Zyngier wrote:
> > On Tue, 25 Jan 2022 15:38:03 +0000,
> > James Morse <james.morse at arm.com> wrote:
> >>
> >> Cortex-A510's erratum #2077057 causes SPSR_EL2 to be corrupted when
> >> single-stepping authenticated ERET instructions. A single step is
> >> expected, but a pointer authentication trap is taken instead. The
> >> erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow
> >> EL1 to cause a return to EL2 with a guest controlled ELR_EL2.
> >>
> >> Because the conditions require an ERET into active-not-pending state,
> >> this is only a problem for the EL2 when EL2 is stepping EL1. In this case
> >> the previous SPSR_EL2 value is preserved in struct kvm_vcpu, and can be
> >> restored.
> 
> > Urgh. That's pretty nasty :-(.
> 
> >> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> >> index 331dd10821df..93bf140cc972 100644
> >> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> >> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> >> @@ -440,6 +442,22 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> >>  			write_sysreg_el2(read_sysreg_el2(SYS_ELR) - 4, SYS_ELR);
> >>  	}
> >>  
> >> +	/*
> >> +	 * Check for the conditions of Cortex-A510's #2077057. When these occur
> >> +	 * SPSR_EL2 can't be trusted, but isn't needed either as it is
> >> +	 * unchanged from the value in vcpu_gp_regs(vcpu)->pstate.
> >> +	 * Did we just take a PAC exception when a step exception was expected?
> >> +	 */
> >> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_2077057) &&
> > 
> > nit: we can drop this IS_ENABLED()...
> 
> Hmmm, I thought dead code elimination was a good thing!
> Without the cpu_errata.c match, (which is also guarded by #ifdef),
> the cap will never be true, even if an affected CPU showed up. This
> way the compiler knows it can remove all this.

I don't dispute that. However, experience shows that the more of these
we sprinkle around, the quicker this code bitrots as we don't build
for all the possible configurations. In general, we hardly have any
dependency on configuration symbols, and rely on static keys got
things not be terribly sucky.

> 
> 
> >> +	    cpus_have_const_cap(ARM64_WORKAROUND_2077057) &&
> > 
> > and make this a final cap. Being a ARM64_CPUCAP_LOCAL_CPU_ERRATUM, we
> > won't accept late CPUs on a system that wasn't affected until then.
> > 
> >> +	    ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ &&
> >> +	    esr_ec == ESR_ELx_EC_PAC &&
> >> +	    vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
> >> +		/* Active-not-pending? */
> >> +		if (*vcpu_cpsr(vcpu) & DBG_SPSR_SS)
> >> +			write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> > 
> > Err... Isn't this way too late? The function starts with:
> > 
> > 	vcpu->arch.ctxt.regs.pstate = read_sysreg_el2(SYS_SPSR);
> > 
> > which is just another way to write:
> > 
> > 	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> > 
> > By that time, the vcpu's PSTATE is terminally corrupted.
> 
> Yes -  bother. Staring at it didn't let me spot that!
> I can hit the conditions to test this, but due to lack of
> imagination the model doesn't corrupt the SPSR.

Bad model. ;-)

> 
> 
> > I think you need to hoist this workaround way up, before we call into
> > early_exit_filter() as it will assume that the guest pstate is correct
> > (this is used by both pKVM and the NV code).
> > 
> > Something like this (untested):
> > 
> > diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > index 93bf140cc972..a8a1502db237 100644
> > --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> > +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > @@ -402,6 +402,26 @@ static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> >  	return false;
> >  }
> >  
> > +static inline void synchronize_vcpu_pstate(struct kvm_vcpu *vcpu,
> > +					   u64 *exit_code)
> > +{
> > +	/*
> > +	 * Check for the conditions of Cortex-A510's #2077057. When these occur
> > +	 * SPSR_EL2 can't be trusted, but isn't needed either as it is
> > +	 * unchanged from the value in vcpu_gp_regs(vcpu)->pstate.
> > +	 * Did we just take a PAC exception when a step exception (being in the
> > +	 * Active-not-pending state) was expected?
> > +	 */
> > +	if (cpus_have_final_cap(ARM64_WORKAROUND_2077057)	&&
> > +	    ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ	&&
> 
> > +	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_PAC	&&
> 
> The vcpu's esr_el2 isn't yet set:
> | ESR_ELx_EC(read_sysreg_el2(SYS_ESR)) == ESR_ELx_EC_PAC

Ah, well spotted!

> 
> (and I'll shuffle the order so this is last as its an extra sysreg read)
> 
> 
> > +	    vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP		&&
> > +	    *vcpu_cpsr(vcpu) & DBG_SPSR_SS)
> > +		write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> > +
> > +	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> > +}
> > +
> >  /*
> >   * Return true when we were able to fixup the guest exit and should return to
> >   * the guest, false when we should restore the host state and return to the
> > @@ -415,7 +435,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> >  	 * Save PSTATE early so that we can evaluate the vcpu mode
> >  	 * early on.
> >  	 */
> > -	vcpu->arch.ctxt.regs.pstate = read_sysreg_el2(SYS_SPSR);
> > +	synchronize_vcpu_pstate(vcpu, exit_code);
> 
> Even better, that saves the noise from moving esr_ec around!

Yup, the patch becomes marginally smaller.

> 
> 
> > Other than that, I'm happy to take the series as a whole ASAP, if only
> > for the two pretty embarrassing bug fixes. If you can respin it
> > shortly and address the comments above, I'd like it into -rc2.
> 
> Will do. Shout if you strongly care about the IS_ENABLED().

I'd really like it gone. Otherwise, I'll never compile that code.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



More information about the linux-arm-kernel mailing list