[PATCH 1/3] KVM: arm64: Fix S1PTW handling on RO memslots

Ard Biesheuvel ardb at kernel.org
Thu Dec 22 05:01:55 PST 2022


On Tue, 20 Dec 2022 at 21:09, Marc Zyngier <maz at kernel.org> wrote:
>
> A recent development on the EFI front has resulted in guests having
> their page tables baked in the firmware binary, and mapped into
> the IPA space as part as a read-only memslot.
>
> Not only this is legitimate, but it also results in added security,
> so thumbs up. However, this clashes mildly with our handling of a S1PTW
> as a write to correctly handle AF/DB updates to the S1 PTs, and results
> in the guest taking an abort it won't recover from (the PTs mapping the
> vectors will suffer freom the same problem...).
>
> So clearly our handling is... wrong.
>
> Instead, switch to a two-pronged approach:
>
> - On S1PTW translation fault, handle the fault as a read
>
> - On S1PTW permission fault, handle the fault as a write
>
> This is of no consequence to SW that *writes* to its PTs (the write
> will trigger a non-S1PTW fault), and SW that uses RO PTs will not
> use AF/DB anyway, as that'd be wrong.
>
> Only in the case described in c4ad98e4b72c ("KVM: arm64: Assume write
> fault on S1PTW permission fault on instruction fetch") do we end-up
> with two back-to-back faults (page being evicted and faulted back).
> I don't think this is a case worth optimising for.
>
> Fixes: c4ad98e4b72c ("KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch")
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> Cc: stable at vger.kernel.org

Reviewed-by: Ard Biesheuvel <ardb at kernel.org>

I have tested this patch on my TX2 with one of the EFI builds in
question, and everything works as before (I never observed the issue
itself)

Regression-tested-by: Ard Biesheuvel <ardb at kernel.org>

For the record, the EFI build in question targets QEMU/mach-virt and
switches to a set of read-only page tables in emulated NOR flash
straight out of reset, so it can create and populate the real page
tables with MMU and caches enabled. EFI does not use virtual memory or
paging so managing access flags or dirty bits in hardware is unlikely
to add any value, and it is not being used at the moment. And given
that this is emulated NOR flash, any ordinary write to it tears down
the r/o memslot altogether, and kicks the NOR flash emulation in QEMU
into programming mode, which is fully based on MMIO emulation and does
not use a memslot at all. IOW, even if we could figure out what store
the PTW was attempting to do, it is always going to be rejected since
the r/o page tables can only be modified by 'programming' the NOR
flash sector.


> ---
>  arch/arm64/include/asm/kvm_emulate.h | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 9bdba47f7e14..fd6ad8b21f85 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -373,8 +373,26 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
>
>  static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
>  {
> -       if (kvm_vcpu_abt_iss1tw(vcpu))
> -               return true;
> +       if (kvm_vcpu_abt_iss1tw(vcpu)) {
> +               /*
> +                * Only a permission fault on a S1PTW should be
> +                * considered as a write. Otherwise, page tables baked
> +                * in a read-only memslot will result in an exception
> +                * being delivered in the guest.
> +                *
> +                * The drawback is that we end-up fauling twice if the
> +                * guest is using any of HW AF/DB: a translation fault
> +                * to map the page containing the PT (read only at
> +                * first), then a permission fault to allow the flags
> +                * to be set.
> +                */
> +               switch (kvm_vcpu_trap_get_fault_type(vcpu)) {
> +               case ESR_ELx_FSC_PERM:
> +                       return true;
> +               default:
> +                       return false;
> +               }
> +       }
>
>         if (kvm_vcpu_trap_is_iabt(vcpu))
>                 return false;
> --
> 2.34.1
>



More information about the linux-arm-kernel mailing list