[PATCH 10/12] KVM: arm64: nv: Add SW walker for AT S1 emulation
Alexandru Elisei
alexandru.elisei at arm.com
Wed Jul 10 08:12:53 PDT 2024
Hi Marc,
On Mon, Jul 08, 2024 at 05:57:58PM +0100, Marc Zyngier wrote:
> In order to plug the brokenness of our current AT implementation,
> we need a SW walker that is going to... err.. walk the S1 tables
> and tell us what it finds.
>
> Of course, it builds on top of our S2 walker, and shares similar
> concepts. The beauty of it is that since it uses kvm_read_guest(),
> it is able to bring back pages that have been otherwise evicted.
>
> This is then plugged in the two AT S1 emulation functions as
> a "slow path" fallback. I'm not sure it is that slow, but hey.
>
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> [..]
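(As an aside for anyone reading along: below is a rough, hand-wavy sketch of what a
single stage-1 descriptor read built on top of kvm_read_guest() might look like. The
function and parameter names are made up for illustration only, this is not the walker
from the patch.)

static int read_s1_desc(struct kvm *kvm, gpa_t table, int index, u64 *desc)
{
	__le64 raw;
	int ret;

	/* kvm_read_guest() brings the backing page back in if it was evicted */
	ret = kvm_read_guest(kvm, table + index * sizeof(raw), &raw, sizeof(raw));
	if (ret)
		return ret;

	/* Stage-1 descriptors are little-endian */
	*desc = le64_to_cpu(raw);
	return 0;
}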
> @@ -331,18 +801,17 @@ void __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
> }
>
> if (!fail)
> - par = read_sysreg(par_el1);
> + par = read_sysreg_par();
> else
> par = SYS_PAR_EL1_F;
>
> + retry_slow = !fail;
> +
> vcpu_write_sys_reg(vcpu, par, PAR_EL1);
>
> /*
> - * Failed? let's leave the building now.
> - *
> - * FIXME: how about a failed translation because the shadow S2
> - * wasn't populated? We may need to perform a SW PTW,
> - * populating our shadow S2 and retry the instruction.
> + * Failed? let's leave the building now, unless we retry on
> + * the slow path.
> */
> if (par & SYS_PAR_EL1_F)
> goto nopan;
This is the code that follows the 'if' statement above and precedes the 'switch'
below:
/* No PAN? No problem. */
if (!(*vcpu_cpsr(vcpu) & PSR_PAN_BIT))
goto nopan;
When KVM is executing this statement, the following is true:
1. SYS_PAR_EL1_F is clear => the hardware translation table walk was successful.
2. retry_slow = true;
Then, if the PAN bit is not set, the function jumps to the nopan label with
retry_slow still set to true, and ends up performing a software translation table
walk even though the hardware walk performed by AT was successful.
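To make the point concrete, one possible way to avoid the spurious slow-path walk
(just a sketch, there may well be a better place to handle this) would be to drop
retry_slow before taking the early exit:

	/* No PAN? No problem, and nothing to retry on the slow path either. */
	if (!(*vcpu_cpsr(vcpu) & PSR_PAN_BIT)) {
		retry_slow = false;
		goto nopan;
	}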
> @@ -354,29 +823,58 @@ void __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
> switch (op) {
> case OP_AT_S1E1RP:
> case OP_AT_S1E1WP:
> + retry_slow = false;
> fail = check_at_pan(vcpu, vaddr, &par);
> break;
> default:
> goto nopan;
> }
>
> + if (fail) {
> + vcpu_write_sys_reg(vcpu, SYS_PAR_EL1_F, PAR_EL1);
> + goto nopan;
> + }
> +
> /*
> * If the EL0 translation has succeeded, we need to pretend
> * the AT operation has failed, as the PAN setting forbids
> * such a translation.
> - *
> - * FIXME: we hardcode a Level-3 permission fault. We really
> - * should return the real fault level.
> */
> - if (fail || !(par & SYS_PAR_EL1_F))
> - vcpu_write_sys_reg(vcpu, (0xf << 1) | SYS_PAR_EL1_F, PAR_EL1);
> -
> + if (par & SYS_PAR_EL1_F) {
> + u8 fst = FIELD_GET(SYS_PAR_EL1_FST, par);
> +
> + /*
> + * If we get something other than a permission fault, we
> + * need to retry, as we're likely to have missed in the PTs.
> + */
> + if ((fst & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_PERM)
> + retry_slow = true;
Shouldn't the VCPU's PAR_EL1 register be updated here? As far as I can tell, at this
point the VCPU PAR_EL1 register has the result from the successful walk
performed by AT S1E1R or AT S1E1W in the first 'switch' statement.
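Something along the lines of the sketch below is what I had in mind (purely
illustrative, the exact PAR_EL1 value to synthesize is of course your call):

	if ((fst & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_PERM)
		retry_slow = true;
	else
		/* Pretend the AT op failed, keeping the real fault level */
		vcpu_write_sys_reg(vcpu,
				   FIELD_PREP(SYS_PAR_EL1_FST, fst) | SYS_PAR_EL1_F,
				   PAR_EL1);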
Thanks,
Alex
> + } else {
> + /*
> +		 * The EL0 access succeeded, but we don't have the full
> +		 * syndrome information to synthesize the failure. Go slow.
> + */
> + retry_slow = true;
> + }
> nopan:
> __mmu_config_restore(&config);
> out:
> local_irq_restore(flags);
>
> write_unlock(&vcpu->kvm->mmu_lock);
> +
> + /*
> + * If retry_slow is true, then we either are missing shadow S2
> + * entries, have paged out guest S1, or something is inconsistent.
> + *
> + * Either way, we need to walk the PTs by hand so that we can either
> +	 * fault things back in, or record accurate fault information along
> + * the way.
> + */
> + if (retry_slow) {
> + par = handle_at_slow(vcpu, op, vaddr);
> + vcpu_write_sys_reg(vcpu, par, PAR_EL1);
> + }
> }