[PATCH] kvm: arm64: Add SVE support for nVHE.
Daniel Kiss
Daniel.Kiss at arm.com
Tue Feb 2 13:44:43 EST 2021
Thanks for the comments, I’m going to send v2 soon.
Thanks
Daniel
> On 26 Jan 2021, at 16:49, Dave Martin <Dave.Martin at arm.com> wrote:
>
> On Fri, Jan 22, 2021 at 06:21:21PM +0000, Marc Zyngier wrote:
>> Daniel,
>>
>> Please consider cc'ing the maintainer (me) as well as the KVM/arm64
>> reviewers (Julien, James, Suzuki) and the kvmarm list (all now Cc'd).
>>
>> On 2021-01-22 01:07, Daniel Kiss wrote:
>>> CPUs that support SVE are architecturally required to support the
>>> Virtualization Host Extensions (VHE), so far the kernel supported
>>> SVE alongside KVM with VHE enabled. In same cases it is desired to
>>> run nVHE config even when VHE is available.
>>> This patch add support for SVE for nVHE configuration too.
>>>
>>> In case of nVHE the system registers behave a bit differently.
>>> ZCR_EL2 defines the maximum vector length that could be set in ZCR_EL1
>>> effectively. To limit the vector length for the guest the ZCR_EL2 need
>>> to be set accordingly therefore it become part of the context.
>>
>> Not really. It's just part of the *hypervisor* state for this guest,
>> and not part of the guest state. Not different from HCR_EL2, for example.
>
> Also, ZCR_EL2 doesn't affect what can be written in ZCR_EL1, so this
> might be reworded to say that it just limits the effective vector length
> available to the guest.
Okay, found a way to switch the ZCR_EL2 without adding to the context.
>>> The sve_state will be loaded in EL2 so it need to be mapped and during
>>> the load ZCR_EL2 will control the vector length.
>>> Trap control is similar to the VHE case except the bit values are the
>>> opposite. ZCR_EL1 access trapping with VHE is ZEN value 0 but in case of
>>> nVHE the TZ need to be set 1 to trigger the exception. Trap control need
>>> to be respected during the context switch even in EL2.
>>
>> Isn't that exactly the same as FPSIMD accesses?
>
> (Yes, this isn't really new. It might be best to let the code speak for
> itself on that point, rather than trying to explain it in the commit
> message.)
>
>>
>>>
>>> Tested on FVP with a Linux guest VM that run with a different VL than
>>> the host system.
>>>
>>> This patch requires sve_set_vq from
>>> - arm64/sve: Rework SVE trap access to minimise memory access
>>
>> Care to add a pointer to this patch? This also shouldn't be part
>> of the commit message. When you repost it, please include the
>> other patch as a series unless it has already been merged by then.
Thanks, I will do like this next time.
By the new changes seems it is not even needed.
>>>
>>> Signed-off-by: Daniel Kiss <daniel.kiss at arm.com>
>>> ---
>>> arch/arm64/Kconfig | 7 ----
>>> arch/arm64/include/asm/fpsimd.h | 4 ++
>>> arch/arm64/include/asm/fpsimdmacros.h | 38 +++++++++++++++++
>>> arch/arm64/include/asm/kvm_host.h | 19 +++------
>>> arch/arm64/kernel/fpsimd.c | 11 +++++
>>> arch/arm64/kvm/arm.c | 5 ---
>>> arch/arm64/kvm/fpsimd.c | 22 +++++++---
>>> arch/arm64/kvm/hyp/fpsimd.S | 15 +++++++
>>> arch/arm64/kvm/hyp/include/hyp/switch.h | 48 ++++++++++++----------
>>> arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 26 ++++++++++++
>>> arch/arm64/kvm/hyp/nvhe/switch.c | 6 ++-
>>> arch/arm64/kvm/reset.c | 8 ++--
>>> 12 files changed, 153 insertions(+), 56 deletions(-)
>>>
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index a6b5b7ef40ae..f17ab095e99f 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -1692,7 +1692,6 @@ endmenu
>>> config ARM64_SVE
>>> bool "ARM Scalable Vector Extension support"
>>> default y
>>> - depends on !KVM || ARM64_VHE
>>> help
>>> The Scalable Vector Extension (SVE) is an extension to the AArch64
>>> execution state which complements and extends the SIMD functionality
>>> @@ -1721,12 +1720,6 @@ config ARM64_SVE
>>> booting the kernel. If unsure and you are not observing these
>>> symptoms, you should assume that it is safe to say Y.
>>>
>>> - CPUs that support SVE are architecturally required to support the
>>> - Virtualization Host Extensions (VHE), so the kernel makes no
>>> - provision for supporting SVE alongside KVM without VHE enabled.
>>> - Thus, you will need to enable CONFIG_ARM64_VHE if you want to support
>>> - KVM in the same kernel image.
>>> -
>>> config ARM64_MODULE_PLTS
>>> bool "Use PLTs to allow module memory to spill over into vmalloc area"
>>> depends on MODULES
>>> diff --git a/arch/arm64/include/asm/fpsimd.h
>>> b/arch/arm64/include/asm/fpsimd.h
>>> index e60aa4ebb351..e7889f4c5cef 100644
>>> --- a/arch/arm64/include/asm/fpsimd.h
>>> +++ b/arch/arm64/include/asm/fpsimd.h
>>> @@ -69,6 +69,10 @@ static inline void *sve_pffr(struct thread_struct
>>> *thread)
>>> extern void sve_save_state(void *state, u32 *pfpsr);
>>> extern void sve_load_state(void const *state, u32 const *pfpsr,
>>> unsigned long vq_minus_1);
>>> +extern void sve_save_state_nvhe(void *state, u32 *pfpsr,
>>> + unsigned long vq_minus_1);
>>> +extern void sve_load_state_nvhe(void const *state, u32 const *pfpsr,
>>> + unsigned long vq_minus_1);
>>
>> Why do we need two different entry points?
Those will use ZCR_EL2 to control the vector length instead of ZCR_EL1.
I’ll add some comments.
>>> extern void sve_flush_live(void);
>>> extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const
>>> *state,
>>> unsigned long vq_minus_1);
>>> diff --git a/arch/arm64/include/asm/fpsimdmacros.h
>>> b/arch/arm64/include/asm/fpsimdmacros.h
>>> index af43367534c7..3a18172ee4f6 100644
>>> --- a/arch/arm64/include/asm/fpsimdmacros.h
>>> +++ b/arch/arm64/include/asm/fpsimdmacros.h
>>> @@ -205,6 +205,17 @@
>>> 921:
>>> .endm
>>>
>>> +/* Update ZCR_EL2.LEN with the new VQ */
>>> +.macro sve_load_vq_nvhe xvqminus1, xtmp, xtmp2
>>> + mrs_s \xtmp, SYS_ZCR_EL2
>>> + bic \xtmp2, \xtmp, ZCR_ELx_LEN_MASK
>>> + orr \xtmp2, \xtmp2, \xvqminus1
>>> + cmp \xtmp2, \xtmp
>>> + b.eq 922f
>>> + msr_s SYS_ZCR_EL2, \xtmp2 //self-synchronising
>>> +922:
>>> +.endm
>>> +
>>> /* Preserve the first 128-bits of Znz and zero the rest. */
>>> .macro _sve_flush_z nz
>>> _sve_check_zreg \nz
>>> @@ -230,6 +241,20 @@
>>> str w\nxtmp, [\xpfpsr, #4]
>>> .endm
>>>
>>> +.macro sve_save_nvhe nxbase, xpfpsr, xvqminus1, nxtmp, xtmp2
>>> + sve_load_vq_nvhe \xvqminus1, x\nxtmp, \xtmp2
>>> + _for n, 0, 31, _sve_str_v \n, \nxbase, \n - 34
>>> + _for n, 0, 15, _sve_str_p \n, \nxbase, \n - 16
>>> + _sve_rdffr 0
>>> + _sve_str_p 0, \nxbase
>>> + _sve_ldr_p 0, \nxbase, -16
>>> +
>>> + mrs x\nxtmp, fpsr
>>> + str w\nxtmp, [\xpfpsr]
>>> + mrs x\nxtmp, fpcr
>>> + str w\nxtmp, [\xpfpsr, #4]
>>> +.endm
>>
>> Please document what this macro does.
ACK.
>>
>>> +
>>> .macro sve_load nxbase, xpfpsr, xvqminus1, nxtmp, xtmp2
>>> sve_load_vq \xvqminus1, x\nxtmp, \xtmp2
>>> _for n, 0, 31, _sve_ldr_v \n, \nxbase, \n - 34
>>> @@ -242,3 +267,16 @@
>>> ldr w\nxtmp, [\xpfpsr, #4]
>>> msr fpcr, x\nxtmp
>>> .endm
>>> +
>>> +.macro sve_load_nvhe nxbase, xpfpsr, xvqminus1, nxtmp, xtmp2
>>> + sve_load_vq_nvhe \xvqminus1, x\nxtmp, \xtmp2
>>> + _for n, 0, 31, _sve_ldr_v \n, \nxbase, \n - 34
>>> + _sve_ldr_p 0, \nxbase
>>> + _sve_wrffr 0
>>> + _for n, 0, 15, _sve_ldr_p \n, \nxbase, \n - 16
>>> +
>>> + ldr w\nxtmp, [\xpfpsr]
>>> + msr fpsr, x\nxtmp
>>> + ldr w\nxtmp, [\xpfpsr, #4]
>>> + msr fpcr, x\nxtmp
>>> +.endm
>
> These macros are virtually the same as some existing macros. We don't
> want to maintain this kind of thing in two places if we can avoid it --
> can you try to keep them unified please?
Done.
>
>>> diff --git a/arch/arm64/include/asm/kvm_host.h
>>> b/arch/arm64/include/asm/kvm_host.h
>>> index 0cd9f0f75c13..8bfac6f52858 100644
>>> --- a/arch/arm64/include/asm/kvm_host.h
>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>> @@ -137,6 +137,7 @@ enum vcpu_sysreg {
>>> ACTLR_EL1, /* Auxiliary Control Register */
>>> CPACR_EL1, /* Coprocessor Access Control */
>>> ZCR_EL1, /* SVE Control */
>>> + ZCR_EL2, /* SVE Control */
>>
>> No. This obviously isn't a guest register, and it has no purpose
>> in this array.
>
> Ack, the value in ZCR_EL2 should just be set based on
> vcpu->arch.sve_max_vl, which is locked after kvm_arm_vcpu_finalize().
> The guest can't change it, so we don't need to re-save this anywhere
> when exiting back to the host: we just need to restore the host value.
>
> We do need somewhere to save the host value for this register while the
> guest is running, though. The value is determined by
> head.S:init_el2_nvhe() (in the __init_el2_nvhe_sve macro), and isn't
> stored anywhere other than in ZCR_EL2 itself. I feel it's better just
> to save and restore that, rather than duplicating the logic that
> determines what it should be for the host -- though obvioulsy it would
> work either way.
>
>>
>>> TTBR0_EL1, /* Translation Table Base Register 0 */
>>> TTBR1_EL1, /* Translation Table Base Register 1 */
>>> TCR_EL1, /* Translation Control Register */
>>> @@ -283,6 +284,7 @@ struct vcpu_reset_state {
>>> struct kvm_vcpu_arch {
>>> struct kvm_cpu_context ctxt;
>>> void *sve_state;
>>> + void *sve_state_hyp;
>
> [...]
>
>>> unsigned int sve_max_vl;
>>>
>>> /* Stage 2 paging state used by the hardware on next switch */
>>> @@ -386,6 +388,10 @@ struct kvm_vcpu_arch {
>>> #define vcpu_sve_pffr(vcpu) ((void *)((char *)((vcpu)->arch.sve_state) +
>>> \
>>> sve_ffr_offset((vcpu)->arch.sve_max_vl)))
>>>
>>> +#define vcpu_sve_pffr_hyp(vcpu) \
>>> + ((void *)((char *)((vcpu)->arch.sve_state_hyp) + \
>>> + sve_ffr_offset((vcpu)->arch.sve_max_vl)))
>>
>> Why can't you have a single macro that embeds kern_hyp_va()?
Done.
>>
>>> +
>>> #define vcpu_sve_state_size(vcpu) ({ \
>>> size_t __size_ret; \
>>> unsigned int __vcpu_vq; \
>>> @@ -579,19 +585,6 @@ static inline void
>>> kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt)
>>> ctxt_sys_reg(cpu_ctxt, MPIDR_EL1) = read_cpuid_mpidr();
>>> }
>>>
>>> -static inline bool kvm_arch_requires_vhe(void)
>>> -{
>>> - /*
>>> - * The Arm architecture specifies that implementation of SVE
>>> - * requires VHE also to be implemented. The KVM code for arm64
>>> - * relies on this when SVE is present:
>>> - */
>>> - if (system_supports_sve())
>>> - return true;
>>> -
>>> - return false;
>>> -}
>>> -
>>> void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
>>>
>>> static inline void kvm_arch_hardware_unsetup(void) {}
>>> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
>>> index 062b21f30f94..1f94c5ca4943 100644
>>> --- a/arch/arm64/kernel/fpsimd.c
>>> +++ b/arch/arm64/kernel/fpsimd.c
>>> @@ -308,6 +308,17 @@ static void fpsimd_save(void)
>>>
>>> if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
>>> if (system_supports_sve() && test_thread_flag(TIF_SVE)) {
>>> +
>>> + if (!has_vhe() && (sve_get_vl() != last->sve_vl)) {
>>> + /* last->sve_state could belong to a vcpu, nVHE
>>> + case the context is loaded in EL2 where ZCR_EL2
>>> + is used. In this context the ZCR_EL1 is used but
>>> + it could be different because it was not
>>> + set when the vcpu context is loaded. Let's set
>>> + it to the guest's VL.
>>> + */
>>
>> Please use the kernel comment style.
>>
>>> + sve_set_vq(sve_vq_from_vl(last->sve_vl) - 1);
>>> + }
>
> On the surface of it, this looks like it's in the wrong place -- there
> shouldn't really be any need for KVM-specific hacks in the core context
> switch code.
>
> Thinking about this, don't we always save out any guest state in
> kvm_arch_vcpu_put_fp(), no? That code calls
> fpsimd_save_and_flush_cpu_state(), so TIF_FOREIGN_FPSTATE should always
> be set when reenabling preemption and exiting the KVM run loop. The
> core context switch code should never see dangling guest state.
>
> There's potential for optimisation here -- we could avoid flushing the
> guest state, but just save it. This would allow the state to be reused
> if the next user of FPSIMD/SVE on this cpu is the KVM run loop. I
> think we figured that the extra complexity wasn't worth it, since
> exiting the guest is already costly and KVM does its best to avoid it
> -- the additional cost of reloading the guest vector regs may therefore
> not be worth caring about.
>
> (Disclaimer -- my understanding of the code may be a bit out of date, so
> shout if the above sounds suspicious!)
>
>
> So in conclusion, actually I'm wondering whether we need to do anything
> at all here. Which is just as well, since fudging ZCR.LEN here would be
> broken anyway: if it was already too small, the extra bits of the SVE
> vector registers could have been thrown away by the hardware already.
last->sve_vl will be set to the VL for the vcpu. in case of VHE it is fine because
the save runs with ZCR_EL2 which holds the maximum VL. Value in the ZCR_EL1
does not matter.
nVHE case the save_state will use ZCR_EL1 that is managed by the guest.
The guest is allowed to set it to other values and does if /proc/sys/abi/sve_default_vector_length
is set.
This can be managed in the kvm part, where ZCR_EL1 is saved.
>
>>> if (WARN_ON(sve_get_vl() != last->sve_vl)) {
>>> /*
>>> * Can't save the user regs, so current would
>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>> index c0ffb019ca8b..3d37f4ad2228 100644
>>> --- a/arch/arm64/kvm/arm.c
>>> +++ b/arch/arm64/kvm/arm.c
>>> @@ -1749,11 +1749,6 @@ int kvm_arch_init(void *opaque)
>>>
>>> in_hyp_mode = is_kernel_in_hyp_mode();
>>>
>>> - if (!in_hyp_mode && kvm_arch_requires_vhe()) {
>>> - kvm_pr_unimpl("CPU unsupported in non-VHE mode, not initializing\n");
>>> - return -ENODEV;
>>> - }
>>> -
>>> if (cpus_have_final_cap(ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE) ||
>>> cpus_have_final_cap(ARM64_WORKAROUND_1508412))
>>> kvm_info("Guests without required CPU erratum workarounds can
>>> deadlock system!\n" \
>>> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
>>> index 3e081d556e81..cfbc869caeeb 100644
>>> --- a/arch/arm64/kvm/fpsimd.c
>>> +++ b/arch/arm64/kvm/fpsimd.c
>>> @@ -42,6 +42,15 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
>>> if (ret)
>>> goto error;
>>>
>>> + if (!has_vhe() && vcpu->arch.sve_state) {
>>> + ret = create_hyp_mappings(vcpu->arch.sve_state,
>>> + vcpu->arch.sve_state +
>>> + SVE_SIG_REGS_SIZE(sve_vq_from_vl(vcpu->arch.sve_max_vl)),
>>> + PAGE_HYP);
>>> + if (ret)
>>> + goto error;
>>> + vcpu->arch.sve_state_hyp = kern_hyp_va(vcpu->arch.sve_state);
>>
>> How useful is it to keep that duplicated pointer around, given that
>> we can compute it on the fly at a very low cost?
>>
>>> + }
>>> vcpu->arch.host_thread_info = kern_hyp_va(ti);
>>> vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
>>> error:
>>> @@ -111,8 +120,9 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
>>> if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
>>> fpsimd_save_and_flush_cpu_state();
>>>
>>> - if (guest_has_sve)
>>> + if (guest_has_sve && has_vhe())
>>> __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL12);
>>> +
>
> Spurious blank line?
Done
>
>>> } else if (host_has_sve) {
>>> /*
>>> * The FPSIMD/SVE state in the CPU has not been touched, and we
>>> @@ -121,10 +131,12 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
>>> * for EL0. To avoid spurious traps, restore the trap state
>>> * seen by kvm_arch_vcpu_load_fp():
>>> */
>
> With nVHE, the HYP exit code already restored CPACR_EL1, right?
>
> Might be worth adding a sentence explaining why skipping this code for
> nVHE is OK.
Done
>
>>> - if (vcpu->arch.flags & KVM_ARM64_HOST_SVE_ENABLED)
>>> - sysreg_clear_set(CPACR_EL1, 0, CPACR_EL1_ZEN_EL0EN);
>>> - else
>>> - sysreg_clear_set(CPACR_EL1, CPACR_EL1_ZEN_EL0EN, 0);
>>> + if (has_vhe()) {
>>> + if (vcpu->arch.flags & KVM_ARM64_HOST_SVE_ENABLED)
>>> + sysreg_clear_set(CPACR_EL1, 0, CPACR_EL1_ZEN_EL0EN);
>>> + else
>>> + sysreg_clear_set(CPACR_EL1, CPACR_EL1_ZEN_EL0EN, 0);
>>> + }
>>> }
>>>
>>> update_thread_flag(TIF_SVE,
>>> diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
>>> index 01f114aa47b0..15d47ebdd30a 100644
>>> --- a/arch/arm64/kvm/hyp/fpsimd.S
>>> +++ b/arch/arm64/kvm/hyp/fpsimd.S
>>> @@ -6,6 +6,7 @@
>>>
>>> #include <linux/linkage.h>
>>>
>>> +#include <asm/assembler.h>
>>> #include <asm/fpsimdmacros.h>
>>>
>>> .text
>>> @@ -19,3 +20,17 @@ SYM_FUNC_START(__fpsimd_restore_state)
>>> fpsimd_restore x0, 1
>>> ret
>>> SYM_FUNC_END(__fpsimd_restore_state)
>>> +
>>> +#ifdef CONFIG_ARM64_SVE
>>> +
>>> +SYM_FUNC_START(sve_save_state_nvhe)
>>> + sve_save_nvhe 0, x1, x2, 3, x4
>>> + ret
>>> +SYM_FUNC_END(sve_save_state_nvhe)
>>> +
>>> +SYM_FUNC_START(sve_load_state_nvhe)
>>> + sve_load_nvhe 0, x1, x2, 3, x4
>>> + ret
>>> +SYM_FUNC_END(sve_load_state_nvhe)
>>> +
>>> +#endif
>>> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h
>>> b/arch/arm64/kvm/hyp/include/hyp/switch.h
>>> index 1f875a8f20c4..a14b0d92d9ca 100644
>>> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
>>> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
>>> @@ -197,25 +197,18 @@ static inline bool __populate_fault_info(struct
>>> kvm_vcpu *vcpu)
>>> /* Check for an FPSIMD/SVE trap and handle as appropriate */
>>> static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
>>> {
>>> - bool vhe, sve_guest, sve_host;
>>> + bool sve_guest, sve_host;
>>> u8 esr_ec;
>>>
>>> if (!system_supports_fpsimd())
>>> return false;
>>>
>>> - /*
>>> - * Currently system_supports_sve() currently implies has_vhe(),
>>> - * so the check is redundant. However, has_vhe() can be determined
>>> - * statically and helps the compiler remove dead code.
>>> - */
>>> - if (has_vhe() && system_supports_sve()) {
>>> + if (system_supports_sve()) {
>>> sve_guest = vcpu_has_sve(vcpu);
>>> sve_host = vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE;
>>> - vhe = true;
>>> } else {
>>> sve_guest = false;
>>> sve_host = false;
>>> - vhe = has_vhe();
>>> }
>>>
>>> esr_ec = kvm_vcpu_trap_get_class(vcpu);
>>> @@ -229,8 +222,7 @@ static inline bool __hyp_handle_fpsimd(struct
>>> kvm_vcpu *vcpu)
>>> return false;
>>>
>>> /* Valid trap. Switch the context: */
>>> -
>>> - if (vhe) {
>>> + if (has_vhe()) {
>
> You might still want to cache this.
>
> There's no way to tell the compiler that consecutive checks on the same
> static key yield the same result, so repeatedly calling has_vhe() may
> emit a small amount of redundant code, IIUC.
Done
>
>>> u64 reg = read_sysreg(cpacr_el1) | CPACR_EL1_FPEN;
>>>
>>> if (sve_guest)
>>> @@ -238,24 +230,30 @@ static inline bool __hyp_handle_fpsimd(struct
>>> kvm_vcpu *vcpu)
>>>
>>> write_sysreg(reg, cpacr_el1);
>>> } else {
>>> - write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
>>> - cptr_el2);
>>> + u64 reg = read_sysreg(cptr_el2) & ~CPTR_EL2_TFP ;
>
> Spurious space before ;
>
> Also, you changed ~(u64)CPTR_EL2_TFP to ~CPTR_EL2_TFP here. Does it
> make a difference?
Done
>
>>> + if (sve_guest)
>>> + reg &= ~CPTR_EL2_TZ;
>
> Same here.
>
>>> + write_sysreg(reg, cptr_el2);
>>> }
>>>
>>> isb();
>>>
>>> if (vcpu->arch.flags & KVM_ARM64_FP_HOST) {
>>> - /*
>>> - * In the SVE case, VHE is assumed: it is enforced by
>>> - * Kconfig and kvm_arch_init().
>>> - */
>>> if (sve_host) {
>>> struct thread_struct *thread = container_of(
>>> vcpu->arch.host_fpsimd_state,
>>> struct thread_struct, uw.fpsimd_state);
>>> + if (has_vhe())
>>> + sve_save_state(sve_pffr(thread),
>>> + &vcpu->arch.host_fpsimd_state->fpsr);
>>> + else {
>>> + struct kvm_cpu_context *host_ctxt =
>>> + &this_cpu_ptr(&kvm_host_data)->host_ctxt;
>>
>> Assignments on a single line please.
>>
>>> + sve_save_state_nvhe(sve_pffr(thread),
>>> + &vcpu->arch.host_fpsimd_state->fpsr,
>>> + ctxt_sys_reg(host_ctxt, ZCR_EL2));
>>> + }
>
> Does this actually need to be different from the VHE case, if we just
> leave ZCR_EL2 permanently set with the maxmium VL allowed for the guest?
>
>>>
>>> - sve_save_state(sve_pffr(thread),
>>> - &vcpu->arch.host_fpsimd_state->fpsr);
>>> } else {
>>> __fpsimd_save_state(vcpu->arch.host_fpsimd_state);
>>> }
>>> @@ -264,10 +262,18 @@ static inline bool __hyp_handle_fpsimd(struct
>>> kvm_vcpu *vcpu)
>>> }
>>>
>>> if (sve_guest) {
>>> - sve_load_state(vcpu_sve_pffr(vcpu),
>>> + if (has_vhe()) {
>>> + sve_load_state(vcpu_sve_pffr(vcpu),
>>> &vcpu->arch.ctxt.fp_regs.fpsr,
>>> sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
>>> - write_sysreg_s(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR_EL12);
>>> + write_sysreg_s(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR_EL12);
>>> + } else {
>>> + sve_load_state_nvhe(vcpu_sve_pffr_hyp(vcpu),
>>> + &vcpu->arch.ctxt.fp_regs.fpsr,
>>> + sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
>>> + write_sysreg_s(__vcpu_sys_reg(vcpu, ZCR_EL2), SYS_ZCR_EL2);
>>
>> I think you really need to explain why you restore ZCR_EL2 here, and not
>> ZCR_EL1.
>
> Something doesn't smell quite right here. ZCR_EL2.LEN determines how
> regs are saved/restored, so for nVHE I think it needs to change as
> follows:
>
> * In host, after saving guest SVE regs:
>
> ZCR_EL2.LEN = ZCR_ELx_LEN_MASK (from __init_el2_nvhe_sve)
>
> * Before saving the host SVE regs here:
>
> ZCR_EL2.LEN = host ZCR_EL1.LEN
> (since the size and layout of current->thread.sve_state is VL dependent)
>
> * Before loading the guest SVE regs, and before running in the guest with
> SVE untrapped:
>
> ZCR_EL2.LEN = sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1
>
>
> I don't quite see how the code achieves this, though I might be missing
> something.
>
>
> The alternative would be to dump the registers in a common layout by
> setting ZCR_EL2.LEN permanently ZCR_ELx_LEN_MASK except when actually
> running the guest. But that may tend to waste a bit of memory, and it
> would be necessary to save the host regs in a side buffer and convert
> them to host layout when exiting the guest.
>
> This would avoid slowness arising from saving regs immediately after a
> ZCR.LEN write, but would add complexity elsewhere.
>
>>
>>> +
>>> + }
>>> } else {
>>> __fpsimd_restore_state(&vcpu->arch.ctxt.fp_regs);
>>> }
>>> diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
>>> b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
>>> index cce43bfe158f..45c16b81b826 100644
>>> --- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
>>> +++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
>>> @@ -46,6 +46,19 @@ static inline void __sysreg_save_el1_state(struct
>>> kvm_cpu_context *ctxt)
>>> ctxt_sys_reg(ctxt, PAR_EL1) = read_sysreg_par();
>>> ctxt_sys_reg(ctxt, TPIDR_EL1) = read_sysreg(tpidr_el1);
>>>
>>> + if (!has_vhe()) {
>>> + u64 reg = read_sysreg(cptr_el2);
>>> + if (reg & CPTR_EL2_TZ) {
>>> + write_sysreg(reg & ~CPTR_EL2_TZ, cptr_el2);
>>> + ctxt_sys_reg(ctxt, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL1);
>>> + ctxt_sys_reg(ctxt, ZCR_EL2) = read_sysreg_s(SYS_ZCR_EL2);
>>> + write_sysreg(reg, cptr_el2);
>>> + } else {
>>> + ctxt_sys_reg(ctxt, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL1);
>>> + ctxt_sys_reg(ctxt, ZCR_EL2) = read_sysreg_s(SYS_ZCR_EL2);
>>
>> Why do you need to save ZCR_EL2? It cannot change behind your back, can it?
>> Also, why aren't these registers saved lazily? If TZ is set, the guest
>> cannot
>> possibly have changed ZCR_EL1.
>>
>>> + }
>>> + }
>>> +
>>> ctxt_sys_reg(ctxt, SP_EL1) = read_sysreg(sp_el1);
>>> ctxt_sys_reg(ctxt, ELR_EL1) = read_sysreg_el1(SYS_ELR);
>>> ctxt_sys_reg(ctxt, SPSR_EL1) = read_sysreg_el1(SYS_SPSR);
>>> @@ -107,6 +120,19 @@ static inline void
>>> __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>>> write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1), par_el1);
>>> write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1), tpidr_el1);
>>>
>>> + if (!has_vhe()) {
>>> + u64 reg = read_sysreg(cptr_el2);
>>> + if (reg & CPTR_EL2_TZ) {
>>> + write_sysreg(reg & ~CPTR_EL2_TZ, cptr_el2);
>>> + write_sysreg_s(ctxt_sys_reg(ctxt, ZCR_EL1), SYS_ZCR_EL1);
>>> + write_sysreg_s(ctxt_sys_reg(ctxt, ZCR_EL2), SYS_ZCR_EL2);
>>> + write_sysreg(reg, cptr_el2);
>>
>> Same thing here. If TZ is set, there is no point in restoring these
>> registers,
>> and it should be delegated to the point where you actually restore the bulk
>> of the SVE state.
>
> Ack
>
>>
>>> + } else {
>>> + write_sysreg_s(ctxt_sys_reg(ctxt, ZCR_EL1), SYS_ZCR_EL1);
>>> + write_sysreg_s(ctxt_sys_reg(ctxt, ZCR_EL2), SYS_ZCR_EL2);
>>> + }
>>> + }
>>> +
>>> if (!has_vhe() &&
>>> cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT) &&
>>> ctxt->__hyp_running_vcpu) {
>>> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c
>>> b/arch/arm64/kvm/hyp/nvhe/switch.c
>>> index 8ae8160bc93a..bffd78203448 100644
>>> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
>>> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
>>> @@ -41,7 +41,11 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
>>>
>>> val = CPTR_EL2_DEFAULT;
>>> val |= CPTR_EL2_TTA | CPTR_EL2_TZ | CPTR_EL2_TAM;
>>> - if (!update_fp_enabled(vcpu)) {
>>> +
>>> + if (update_fp_enabled(vcpu)) {
>>> + if (vcpu_has_sve(vcpu))
>>> + val &= ~CPTR_EL2_TZ;
>>> + } else {
>>
>> nit: please don't invert conditions like this.
Done, just wanted to have the same flow as in vhe code.
>>
>>> val |= CPTR_EL2_TFP;
>>> __activate_traps_fpsimd32(vcpu);
>>> }
>>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>>> index f32490229a4c..3c62fb10a3fe 100644
>>> --- a/arch/arm64/kvm/reset.c
>>> +++ b/arch/arm64/kvm/reset.c
>>> @@ -127,10 +127,6 @@ static int kvm_vcpu_enable_sve(struct kvm_vcpu *vcpu)
>>> if (!system_supports_sve())
>>> return -EINVAL;
>>>
>>> - /* Verify that KVM startup enforced this when SVE was detected: */
>>> - if (WARN_ON(!has_vhe()))
>>> - return -EINVAL;
>>> -
>>> vcpu->arch.sve_max_vl = kvm_sve_max_vl;
>>>
>>> /*
>>> @@ -168,6 +164,10 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu
>>> *vcpu)
>>> return -ENOMEM;
>>>
>>> vcpu->arch.sve_state = buf;
>>> +
>>> + if (!has_vhe())
>>> + __vcpu_sys_reg(vcpu, ZCR_EL2) = sve_vq_from_vl(vl) - 1;
>>
>> As I understand it, this is the only time where ZCR_EL2 is ever set
>> for a given guest. It only has to be restored on guest entry, never
>> saved back.
>
> [...]
>>
>> Thanks,
>>
>> M.
>
> (Except that the hyp code may need to mess with ZCR_EL2 as outlined
> above. But ack that the value that needs to be in ZCR_EL2 while the
> guest runs should never change from the value determined here IIUC.)
>
> Cheers
> ---Dave
More information about the linux-arm-kernel
mailing list