[PATCH] KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch

Will Deacon <will@kernel.org>
Fri Sep 11 11:59:13 EDT 2020

On Wed, Sep 09, 2020 at 10:05:27PM +0100, Marc Zyngier wrote:
> KVM currently assumes that an instruction abort can never be a write.
> This is in general true, except when the abort is triggered by
> a S1PTW on instruction fetch that tries to update the S1 page tables
> (to set AF, for example).
>
> This can happen if the page tables have been paged out and brought
> back in without seeing a direct write to them (they are thus marked
> read only), and the fault handling code will make the PT executable(!)
> instead of writable. The guest gets stuck forever.
>
> In these conditions, the permission fault must be considered as
> a write so that the Stage-1 update can take place. This is essentially
> the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM:
> Take S1 walks into account when determining S2 write faults").
>
> Update both kvm_is_write_fault() to return true on IABT+S1PTW, as well
> as kvm_vcpu_trap_is_iabt() to return false in the same conditions.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> ---
> This could do with some cleanup (kvm_vcpu_dabt_iss1tw has nothing to do
> with data aborts), but I've chosen to keep the patch simple in order to
> ease backporting.
>
>  arch/arm64/include/asm/kvm_emulate.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index d21676409a24..33d7e16edaa3 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -480,7 +480,8 @@ static __always_inline u8 kvm_vcpu_trap_get_class(const struct kvm_vcpu *vcpu)
>  static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
>  {
> -	return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
> +	return (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW &&
> +		!kvm_vcpu_dabt_iss1tw(vcpu));
>  }
>  static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
> @@ -520,6 +521,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
>  static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
>  {
> +	if (kvm_vcpu_dabt_iss1tw(vcpu))
> +		return true;
> +

Hmm, I'm a bit uneasy about the interaction of this with
kvm_handle_guest_abort() if we take an S1PTW fault on instruction fetch
with our page-tables sitting in a read-only memslot. In this case, I
think we'll end up injecting a data abort into the guest instead of an
instruction abort. It hurts my brain thinking about it though.

Overall, I'd be inclined to:

  1. Rename kvm_vcpu_dabt_iss1tw() to kvm_vcpu_abt_iss1tw()

  2. Introduce something like kvm_is_exec_fault() as:

	return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);

  3. Use that new function in user_mem_abort() to assign 'exec_fault'

  4. Hack kvm_is_write_fault() as you have done above.

Which I _think_ should work (famous last words)...

The only nasty bit is that we then duplicate the kvm_vcpu_dabt_iss1tw()
check in both kvm_is_write_fault() and kvm_vcpu_dabt_iswrite(). Perhaps
we could remove the latter function in favour of the first? Anyway,
obviously this sort of cleanup isn't for stable.


More information about the linux-arm-kernel mailing list