[PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots

Zhou Wang wangzhou1 at hisilicon.com
Mon Nov 10 18:12:07 PST 2025


On 2025/11/7 19:48, Suzuki K Poulose wrote:
> On 07/11/2025 07:21, Zhou Wang wrote:
>> From: Marc Zyngier <maz at kernel.org>
>>
>> The main use of {LD,ST}64B* is to talk to a device, which is hopefully
>> directly assigned to the guest and requires no additional handling.
>>
>> However, this does not preclude a VMM from exposing a virtual device
>> to the guest, and to allow 64 byte accesses as part of the programming
>> interface. A direct consequence of this is that we need to be able
>> to forward such access to userspace.
>>
>> Given that such a contraption is very unlikely to ever exist, we choose
>> to offer a limited service: userspace gets (as part of a new exit reason)
>> the ESR, the IPA, and that's it. It is fully expected to handle the full
>> semantics of the instructions, deal with ACCDATA, the return values and
>> increment PC. Much fun.
>>
>> A canonical implementation can also simply inject an abort and be done
>> with it. Frankly, don't try to do anything else unless you have time
>> to waste.
>>
>> Signed-off-by: Marc Zyngier <maz at kernel.org>
>> Signed-off-by: Yicong Yang <yangyicong at hisilicon.com>
>> Signed-off-by: Zhou Wang <wangzhou1 at hisilicon.com>
> 
> We also need to document this new EXIT reason here :
> 
> Documentation/virt/kvm/api.rst
> 
> 
>> ---
>>   arch/arm64/kvm/mmio.c    | 27 ++++++++++++++++++++++++++-
>>   include/uapi/linux/kvm.h |  3 ++-
>>   2 files changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
>> index 54f9358c9e0e..2a6261abb647 100644
>> --- a/arch/arm64/kvm/mmio.c
>> +++ b/arch/arm64/kvm/mmio.c
>> @@ -159,6 +159,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>>       bool is_write;
>>       int len;
>>       u8 data_buf[8];
>> +    u64 esr;
>> +
>> +    esr = kvm_vcpu_get_esr(vcpu);
>>
>>       /*
>>        * No valid syndrome? Ask userspace for help if it has
>> @@ -168,7 +171,7 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>>        * though, so directly deliver an exception to the guest.
>>        */
>>       if (!kvm_vcpu_dabt_isvalid(vcpu)) {
>> -        trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
>> +        trace_kvm_mmio_nisv(*vcpu_pc(vcpu), esr,
>>                       kvm_vcpu_get_hfar(vcpu), fault_ipa);
>>
>>           if (vcpu_is_protected(vcpu))
>> @@ -185,6 +188,28 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>>           return -ENOSYS;
>>       }
>>
>> +    /*
>> +     * When (DFSC == 0b00xxxx || DFSC == 0b10101x) && DFSC != 0b0000xx
>> +     * ESR_EL2[12:11] describe the Load/Store Type. This allows us to
>> +     * punt the LD64B/ST64B/ST64BV/ST64BV0 instructions to luserspace,
> 
> minor nit: typo: s/luserspace/userspace/ 

Will fix this in the next version.
> 
>> +     * which will have to provide a full emulation of these 4
>> +     * instructions.  No, we don't expect this to be fast.
>> +     *
>> +     * We rely on traps being set if the corresponding features are not
>> +     * enabled, so if we get here, userspace has promised us to handle
>> +     * it already.
>> +     */
>> +    switch (kvm_vcpu_trap_get_fault(vcpu)) {
>> +    case 0b000100 ... 0b001111:
>> +    case 0b101010 ... 0b101011:
> 
> Matches Arm ARM.

This is described in Arm ARM L.b, D24.2.40 (page 7526). The condition there
excludes 0b0000xx, which is why the first case in the code above is
"case 0b000100 ... 0b001111": it simply skips 0b0000xx.
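
For what it's worth, the match between the rule in the comment and the two
case ranges can be checked exhaustively over the 6-bit DFSC field. Below is a
minimal standalone sketch (plain userspace C, not kernel code); the helper
names dfsc_rule(), dfsc_switch() and check_dfsc_equivalence() are made up for
this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Rule as worded in the patch comment:
 *   (DFSC == 0b00xxxx || DFSC == 0b10101x) && DFSC != 0b0000xx
 * Hex masks are used instead of 0b literals for portability.
 */
static bool dfsc_rule(uint8_t dfsc)
{
	bool is_00xxxx = (dfsc & 0x30) == 0x00;	/* top two bits clear */
	bool is_10101x = (dfsc & 0x3e) == 0x2a;	/* 0b101010 or 0b101011 */
	bool is_0000xx = (dfsc & 0x3c) == 0x00;	/* 0b000000 .. 0b000011 */

	return (is_00xxxx || is_10101x) && !is_0000xx;
}

/* Ranges as written in the switch statement of the patch. */
static bool dfsc_switch(uint8_t dfsc)
{
	return (dfsc >= 0x04 && dfsc <= 0x0f) ||	/* 0b000100 ... 0b001111 */
	       (dfsc >= 0x2a && dfsc <= 0x2b);	/* 0b101010 ... 0b101011 */
}

/* Walk all 64 encodings and compare both formulations. */
static void check_dfsc_equivalence(void)
{
	for (uint8_t dfsc = 0; dfsc < 64; dfsc++)
		assert(dfsc_rule(dfsc) == dfsc_switch(dfsc));
}
```

Calling check_dfsc_equivalence() from a small test program is enough to
confirm that the two formulations select the same set of fault status codes.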

> 
>> +        if (FIELD_GET(GENMASK(12, 11), esr)) {
>> +            run->exit_reason = KVM_EXIT_ARM_LDST64B;
>> +            run->arm_nisv.esr_iss = esr & ~(u64)ESR_ELx_FSC;
> 
> Any particular reason why we diverge from the NISV case, where the FSC is provided,
> but not here ? May be this needs to be documented too.

The NISV case and this case are different. NISV indicates whether the syndrome
information in ISS[23:14] is valid, whereas bits [12:11] (LST) indicate which
LS64 instruction generated the data abort. I am not sure why FSC is masked
here; LST already seems to provide the relevant information.
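
To illustrate what userspace could derive from LST alone, here is a rough
sketch of a decoder a VMM might apply to the esr_iss value it receives. It is
only a sketch: the enum, macro and function names are hypothetical, and the
LST encodings (0b01 = ST64BV, 0b10 = LD64B or ST64B, 0b11 = ST64BV0) as well
as the use of WnR (bit 6) to tell LD64B from ST64B reflect my reading of the
Arm ARM and should be double-checked against the spec.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, local to this sketch (not from the patch or the UAPI). */
enum ls64_kind {
	LS64_NONE,	/* LST == 0b00: not reported as an LS64 access */
	LS64_LD64B,
	LS64_ST64B,
	LS64_ST64BV,
	LS64_ST64BV0,
};

#define SKETCH_ESR_LST_SHIFT	11		/* LST is ESR bits [12:11] */
#define SKETCH_ESR_LST_MASK	0x3ULL
#define SKETCH_ESR_WNR		(1ULL << 6)	/* assumed: write-not-read */

/* Classify the faulting instruction from the syndrome passed to userspace. */
static enum ls64_kind ls64_decode(uint64_t esr_iss)
{
	switch ((esr_iss >> SKETCH_ESR_LST_SHIFT) & SKETCH_ESR_LST_MASK) {
	case 1:
		return LS64_ST64BV;
	case 2:
		/* Assumption: WnR separates the store from the load. */
		return (esr_iss & SKETCH_ESR_WNR) ? LS64_ST64B : LS64_LD64B;
	case 3:
		return LS64_ST64BV0;
	default:
		return LS64_NONE;
	}
}

/* Small self-check of the mapping assumed above. */
static void ls64_selftest(void)
{
	assert(ls64_decode(0) == LS64_NONE);
	assert(ls64_decode(3ULL << SKETCH_ESR_LST_SHIFT) == LS64_ST64BV0);
}
```

Since the patch clears only the FSC bits [5:0] of esr_iss, both LST and WnR
remain available to such a decoder.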

Best,
Zhou

> 
> Suzuki
> 
> 
>> +            run->arm_nisv.fault_ipa = fault_ipa;
>> +            return 0;
>> +        }
>> +    }
>> +
>>       /*
>>        * Prepare MMIO operation. First decode the syndrome data we get
>>        * from the CPU. Then try if some in-kernel emulation feels
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 52f6000ab020..d219946b96be 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -179,6 +179,7 @@ struct kvm_xen_exit {
>>   #define KVM_EXIT_LOONGARCH_IOCSR  38
>>   #define KVM_EXIT_MEMORY_FAULT     39
>>   #define KVM_EXIT_TDX              40
>> +#define KVM_EXIT_ARM_LDST64B      41
>>
>>   /* For KVM_EXIT_INTERNAL_ERROR */
>>   /* Emulate instruction failed. */
>> @@ -401,7 +402,7 @@ struct kvm_run {
>>           } eoi;
>>           /* KVM_EXIT_HYPERV */
>>           struct kvm_hyperv_exit hyperv;
>> -        /* KVM_EXIT_ARM_NISV */
>> +        /* KVM_EXIT_ARM_NISV / KVM_EXIT_ARM_LDST64B */
>>           struct {
>>               __u64 esr_iss;
>>               __u64 fault_ipa;
> 
> 
> 
> 


