[PATCH 10/12] KVM: arm64: nv: Add SW walker for AT S1 emulation

Alexandru Elisei alexandru.elisei at arm.com
Wed Jul 10 08:12:53 PDT 2024


Hi Marc,

On Mon, Jul 08, 2024 at 05:57:58PM +0100, Marc Zyngier wrote:
> In order to plug the brokenness of our current AT implementation,
> we need a SW walker that is going to... err.. walk the S1 tables
> and tell us what it finds.
> 
> Of course, it builds on top of our S2 walker, and share similar
> concepts. The beauty of it is that since it uses kvm_read_guest(),
> it is able to bring back pages that have been otherwise evicted.
> 
> This is then plugged in the two AT S1 emulation functions as
> a "slow path" fallback. I'm not sure it is that slow, but hey.
> 
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> [..]
> @@ -331,18 +801,17 @@ void __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
>  	}
>  
>  	if (!fail)
> -		par = read_sysreg(par_el1);
> +		par = read_sysreg_par();
>  	else
>  		par = SYS_PAR_EL1_F;
>  
> +	retry_slow = !fail;
> +
>  	vcpu_write_sys_reg(vcpu, par, PAR_EL1);
>  
>  	/*
> -	 * Failed? let's leave the building now.
> -	 *
> -	 * FIXME: how about a failed translation because the shadow S2
> -	 * wasn't populated? We may need to perform a SW PTW,
> -	 * populating our shadow S2 and retry the instruction.
> +	 * Failed? let's leave the building now, unless we retry on
> +	 * the slow path.
>  	 */
>  	if (par & SYS_PAR_EL1_F)
>  		goto nopan;

This is what follows after the 'if' statement above, and before the 'switch'
below:

        /* No PAN? No problem. */
        if (!(*vcpu_cpsr(vcpu) & PSR_PAN_BIT))
                goto nopan;

When KVM is executing this statement, the following is true:

1. SYS_PAR_EL1_F is clear => the hardware translation table walk was successful.
2. retry_slow = true;

Then if the PAN bit is not set, the function jumps to the nopan label, and
performs a software translation table walk, even though the hardware walk
performed by AT was successful.

> @@ -354,29 +823,58 @@ void __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
>  	switch (op) {
>  	case OP_AT_S1E1RP:
>  	case OP_AT_S1E1WP:
> +		retry_slow = false;
>  		fail = check_at_pan(vcpu, vaddr, &par);
>  		break;
>  	default:
>  		goto nopan;
>  	}
>  
> +	if (fail) {
> +		vcpu_write_sys_reg(vcpu, SYS_PAR_EL1_F, PAR_EL1);
> +		goto nopan;
> +	}
> +
>  	/*
>  	 * If the EL0 translation has succeeded, we need to pretend
>  	 * the AT operation has failed, as the PAN setting forbids
>  	 * such a translation.
> -	 *
> -	 * FIXME: we hardcode a Level-3 permission fault. We really
> -	 * should return the real fault level.
>  	 */
> -	if (fail || !(par & SYS_PAR_EL1_F))
> -		vcpu_write_sys_reg(vcpu, (0xf << 1) | SYS_PAR_EL1_F, PAR_EL1);
> -
> +	if (par & SYS_PAR_EL1_F) {
> +		u8 fst = FIELD_GET(SYS_PAR_EL1_FST, par);
> +
> +		/*
> +		 * If we get something other than a permission fault, we
> +		 * need to retry, as we're likely to have missed in the PTs.
> +		 */
> +		if ((fst & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_PERM)
> +			retry_slow = true;

Shouldn't VCPU's PAR_EL1 register be updated here? As far as I can tell, at this
point the VCPU PAR_EL1 register has the result from the successful walk
performed by AT S1E1R or AT S1E1W in the first 'switch' statement.

Thanks,
Alex

> +	} else {
> +		/*
> +		 * The EL0 access succeded, but we don't have the full
> +		 * syndrom information to synthetize the failure. Go slow.
> +		 */
> +		retry_slow = true;
> +	}
>  nopan:
>  	__mmu_config_restore(&config);
>  out:
>  	local_irq_restore(flags);
>  
>  	write_unlock(&vcpu->kvm->mmu_lock);
> +
> +	/*
> +	 * If retry_slow is true, then we either are missing shadow S2
> +	 * entries, have paged out guest S1, or something is inconsistent.
> +	 *
> +	 * Either way, we need to walk the PTs by hand so that we can either
> +	 * fault things back, in or record accurate fault information along
> +	 * the way.
> +	 */
> +	if (retry_slow) {
> +		par = handle_at_slow(vcpu, op, vaddr);
> +		vcpu_write_sys_reg(vcpu, par, PAR_EL1);
> +	}
>  }



More information about the linux-arm-kernel mailing list