[PATCH v4] KVM: riscv: Skip CSR restore if VCPU is reloaded on the same core

Andrew Jones andrew.jones at oss.qualcomm.com
Tue Feb 24 13:50:54 PST 2026


On Sun, Feb 22, 2026 at 12:57:41PM +0800, Jinyu Tang wrote:
> Currently, kvm_arch_vcpu_load() unconditionally restores guest CSRs and
> HGATP. However, when a preempted VCPU is scheduled back on the same
> physical CPU, and no other KVM VCPU has run on this CPU in the meantime,
> the hardware CSRs are still valid.
> 
> This patch optimizes the vcpu_load path by skipping the expensive CSR
> writes if all the following conditions are met:
> 1. The VCPU was previously preempted (vcpu->scheduled_out == 1).
> 2. It is being reloaded on the same CPU (vcpu->arch.last_exit_cpu == cpu).
> 3. No other VCPU used this CPU (vcpu == __this_cpu_read(kvm_former_vcpu)).
> 4. The CSRs are not dirty (!vcpu->arch.csr_dirty).
> 
> To ensure this fast-path doesn't break corner cases:
> - Live migration and VCPU reset are naturally safe. KVM initializes
>   last_exit_cpu to -1, which guarantees the fast-path won't trigger.
> - A new 'csr_dirty' flag is introduced to track runtime userspace
>   interventions. If userspace modifies guest configurations (e.g.,
>   hedeleg via KVM_SET_GUEST_DEBUG, or CSRs via KVM_SET_ONE_REG) while
>   the VCPU is preempted, the flag is set to skip fast path.

I'm a bit concerned that the need for csr_dirty makes this fragile. Maybe
kvm_arch_vcpu_load() could be reorganized/factored in a way that makes it
obvious to anybody adding another CSR to the load path that they also need
to ensure csr_dirty is set when appropriate? At least a big comment would
help.

That said, can we now apply csr_dirty everywhere CSRs are dirtied, allowing
us to remove the vcpu->scheduled_out check? Unless I'm missing something,
we should only need to ensure the last-cpu and last-vcpu are as expected and
that no CSRs are dirty in order to skip the writes - it shouldn't matter
whether we're reloading after a sched out or otherwise.

A couple nits below.

> 
> Note that kvm_riscv_vcpu_aia_load() is kept outside the skip logic
> to ensure IMSIC/AIA interrupt states are always properly
> synchronized.
> 
> Signed-off-by: Jinyu Tang <tjytimi at 163.com>
> ---
>  v3 -> v4:
>  - Addressed Anup Patel's review regarding hardware state inconsistency.
>  - Introduced 'csr_dirty' flag to track dynamic userspace CSR/CONFIG
>    modifications (KVM_SET_ONE_REG, KVM_SET_GUEST_DEBUG), forcing a full
>    restore when userspace debugs or modifies state.
>  - Kept kvm_riscv_vcpu_aia_load() out of the skip block to resolve IMSIC
>    VS-file instability.
> 
>  v2 -> v3:
>  v2 was missing a critical check because I generated the patch from the
>  wrong (experimental) branch. This is fixed in v3. Sorry for the trouble.
> 
>  v1 -> v2:
>  Applied the logic to the AIA CSR load path. Thanks to Andrew Jones for
>  the advice.
> ---
>  arch/riscv/include/asm/kvm_host.h |  3 +++
>  arch/riscv/kvm/vcpu.c             | 13 +++++++++++++
>  arch/riscv/kvm/vcpu_onereg.c      |  3 +++
>  3 files changed, 19 insertions(+)
> 
> diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> index 24585304c..7ee47b83c 100644
> --- a/arch/riscv/include/asm/kvm_host.h
> +++ b/arch/riscv/include/asm/kvm_host.h
> @@ -273,6 +273,9 @@ struct kvm_vcpu_arch {
>  	/* 'static' configurations which are set only once */
>  	struct kvm_vcpu_config cfg;
>  
> +	/* Indicates modified guest CSRs */
> +	bool csr_dirty;
> +
>  	/* SBI steal-time accounting */
>  	struct {
>  		gpa_t shmem;
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index a55a95da5..f7f58f02c 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -24,6 +24,8 @@
>  #define CREATE_TRACE_POINTS
>  #include "trace.h"
>  
> +static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_former_vcpu);
> +
>  const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
>  	KVM_GENERIC_VCPU_STATS(),
>  	STATS_DESC_COUNTER(VCPU, ecall_exit_stat),
> @@ -537,6 +539,9 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>  		vcpu->arch.cfg.hedeleg |= BIT(EXC_BREAKPOINT);
>  	}
>  
> +	/* Mark CSRs dirty on hedeleg update */

Unnecessary comment.

> +	vcpu->arch.csr_dirty = true;
> +
>  	return 0;
>  }
>  
> @@ -581,6 +586,11 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  	struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr;
>  	struct kvm_vcpu_config *cfg = &vcpu->arch.cfg;
>  
> +	if (vcpu->scheduled_out && vcpu == __this_cpu_read(kvm_former_vcpu) &&
> +		vcpu->arch.last_exit_cpu == cpu && !vcpu->arch.csr_dirty)

Align the second line condition under the first

	if (vcpu->scheduled_out && vcpu == __this_cpu_read(kvm_former_vcpu) &&
	    vcpu->arch.last_exit_cpu == cpu && !vcpu->arch.csr_dirty)

> +		goto csr_restore_done;
> +
> +	vcpu->arch.csr_dirty = false;
>  	if (kvm_riscv_nacl_sync_csr_available()) {
>  		nsh = nacl_shmem();
>  		nacl_csr_write(nsh, CSR_VSSTATUS, csr->vsstatus);
> @@ -624,6 +634,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  
>  	kvm_riscv_mmu_update_hgatp(vcpu);
>  
> +csr_restore_done:
>  	kvm_riscv_vcpu_timer_restore(vcpu);
>  
>  	kvm_riscv_vcpu_host_fp_save(&vcpu->arch.host_context);
> @@ -645,6 +656,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  	void *nsh;
>  	struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr;
>  
> +	__this_cpu_write(kvm_former_vcpu, vcpu);
> +
>  	vcpu->cpu = -1;
>  
>  	kvm_riscv_vcpu_aia_put(vcpu);
> diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c
> index e7ab6cb00..88cfcb018 100644
> --- a/arch/riscv/kvm/vcpu_onereg.c
> +++ b/arch/riscv/kvm/vcpu_onereg.c
> @@ -652,6 +652,9 @@ static int kvm_riscv_vcpu_set_reg_csr(struct kvm_vcpu *vcpu,
>  	if (rc)
>  		return rc;
>  
> +	/* Mark CSRs dirty after userspace update csr */

Unnecessary comment.

> +	vcpu->arch.csr_dirty = true;
> +
>  	return 0;
>  }
>  
> -- 
> 2.43.0
>

Thanks,
drew
