[PATCH v2 4/5] KVM: arm64: Prevent host from managing timer offsets for protected VMs

Sun Nov 9 11:51:21 PST 2025

Hi Oliver,

On Fri, 7 Nov 2025 at 23:21, Oliver Upton <oupton at kernel.org> wrote:
>
> On Thu, Nov 06, 2025 at 02:44:16PM +0000, Fuad Tabba wrote:
> > For protected VMs, the guest's timer offset state is private and must
> > not be controlled by the host. Protected VMs must always run with a
> > virtual counter offset of 0.
> >
> > The existing timer logic allowed the host to set and manage the timer
> > counter offsets (voffset and poffset) for protected VMs.
> >
> > This patch disables all host-side management of timer offsets for
> > protected VMs by adding checks in the relevant code paths.
>
> "This patch ..." is generally discouraged in changelogs, just state what
> you're doing in an imperative tone.

Ack.

> > Signed-off-by: Fuad Tabba <tabba at google.com>
> > ---
> >  arch/arm64/kvm/arch_timer.c | 18 +++++++++++++-----
> >  arch/arm64/kvm/sys_regs.c   |  6 ++++--
> >  2 files changed, 17 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index 3f675875abea..69f5631ebf84 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -1056,10 +1056,14 @@ static void timer_context_init(struct kvm_vcpu *vcpu, int timerid)
> >
> >       ctxt->timer_id = timerid;
> >
> > -     if (timerid == TIMER_VTIMER)
> > -             ctxt->offset.vm_offset = &kvm->arch.timer_data.voffset;
> > -     else
> > -             ctxt->offset.vm_offset = &kvm->arch.timer_data.poffset;
> > +     if (!kvm_vm_is_protected(vcpu->kvm)) {
> > +             if (timerid == TIMER_VTIMER)
> > +                     ctxt->offset.vm_offset = &kvm->arch.timer_data.voffset;
> > +             else
> > +                     ctxt->offset.vm_offset = &kvm->arch.timer_data.poffset;
> > +     } else {
> > +             ctxt->offset.vm_offset = NULL;
> > +     }
> >
> >       hrtimer_setup(&ctxt->hrtimer, kvm_hrtimer_expire, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
> >
> > @@ -1083,7 +1087,8 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
> >               timer_context_init(vcpu, i);
> >
> >       /* Synchronize offsets across timers of a VM if not already provided */
> > -     if (!test_bit(KVM_ARCH_FLAG_VM_COUNTER_OFFSET, &vcpu->kvm->arch.flags)) {
> > +     if (!vcpu_is_protected(vcpu) &&
> > +         !test_bit(KVM_ARCH_FLAG_VM_COUNTER_OFFSET, &vcpu->kvm->arch.flags)) {
> >               timer_set_offset(vcpu_vtimer(vcpu), kvm_phys_timer_read());
> >               timer_set_offset(vcpu_ptimer(vcpu), 0);
> >       }
> > @@ -1687,6 +1692,9 @@ int kvm_vm_ioctl_set_counter_offset(struct kvm *kvm,
> >       if (offset->reserved)
> >               return -EINVAL;
> >
> > +     if (kvm_vm_is_protected(kvm))
> > +             return -EBUSY;
> > +
>
> This should be -EINVAL as pVMs do not even advertise the capability.
>
> Since we already have a generic helper for filtering KVM_CAPs, I'd
> prefer that we have a similar thing for enforcing ioctl limitations too.
>
> For example, you could maintain the ioctl => KVM_CAP mapping in a table
> and use kvm_pvm_ext_allowed() as the source of truth.

Yes, it makes more sense to consolidate these checks. Will do this
when I respin.
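
A rough userspace mock-up of the table-driven check suggested above
might look like the following. It is untested against the kernel tree.
Every mock_*/MOCK_* name and value is a placeholder, and
mock_pvm_ext_allowed() only stands in for kvm_pvm_ext_allowed(). Letting
ioctls that are missing from the table pass through is an assumption,
not something settled in this thread.

```c
/* Userspace mock-up only: not kernel code, all names are hypothetical. */
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder numbers; the real ioctls and caps live in <linux/kvm.h>. */
enum { MOCK_KVM_ARM_SET_COUNTER_OFFSET = 1, MOCK_KVM_OTHER_IOCTL = 2 };
enum { MOCK_KVM_CAP_COUNTER_OFFSET = 100, MOCK_KVM_CAP_OTHER = 101 };

/* Stand-in for kvm_pvm_ext_allowed(): caps a pVM advertises. */
static bool mock_pvm_ext_allowed(long ext)
{
	switch (ext) {
	case MOCK_KVM_CAP_OTHER:
		return true;
	default:
		/* MOCK_KVM_CAP_COUNTER_OFFSET is not advertised to pVMs. */
		return false;
	}
}

struct mock_ioctl_cap {
	unsigned int ioctl;
	long cap;
};

/* ioctl => KVM_CAP mapping; the cap filter stays the source of truth. */
static const struct mock_ioctl_cap mock_pvm_ioctl_caps[] = {
	{ MOCK_KVM_ARM_SET_COUNTER_OFFSET, MOCK_KVM_CAP_COUNTER_OFFSET },
	{ MOCK_KVM_OTHER_IOCTL,            MOCK_KVM_CAP_OTHER },
};

/*
 * Return 0 if the ioctl may proceed, -EINVAL if it is tied to a
 * capability that protected VMs do not advertise.
 */
static int mock_pvm_check_ioctl(bool is_protected, unsigned int ioctl)
{
	size_t i;
	size_t n = sizeof(mock_pvm_ioctl_caps) / sizeof(mock_pvm_ioctl_caps[0]);

	if (!is_protected)
		return 0;

	for (i = 0; i < n; i++) {
		if (mock_pvm_ioctl_caps[i].ioctl != ioctl)
			continue;
		return mock_pvm_ext_allowed(mock_pvm_ioctl_caps[i].cap) ?
			0 : -EINVAL;
	}

	/* Assumption: ioctls without a table entry are not gated here. */
	return 0;
}
```

With a check of this shape, kvm_vm_ioctl_set_counter_offset() would not
need its own kvm_vm_is_protected() test, and the error would be -EINVAL
rather than -EBUSY.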

> >       mutex_lock(&kvm->lock);
> >
> >       if (!kvm_trylock_all_vcpus(kvm)) {
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index e67eb39ddc11..3329a8f03436 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -1606,11 +1606,13 @@ static int arch_timer_set_user(struct kvm_vcpu *vcpu,
> >               val &= ~ARCH_TIMER_CTRL_IT_STAT;
> >               break;
> >       case SYS_CNTVCT_EL0:
> > -             if (!test_bit(KVM_ARCH_FLAG_VM_COUNTER_OFFSET, &vcpu->kvm->arch.flags))
> > +             if (!vcpu_is_protected(vcpu) &&
> > +                 !test_bit(KVM_ARCH_FLAG_VM_COUNTER_OFFSET, &vcpu->kvm->arch.flags))
> >                       timer_set_offset(vcpu_vtimer(vcpu), kvm_phys_timer_read() - val);
> >               return 0;
> >       case SYS_CNTPCT_EL0:
> > -             if (!test_bit(KVM_ARCH_FLAG_VM_COUNTER_OFFSET, &vcpu->kvm->arch.flags))
> > +             if (!vcpu_is_protected(vcpu) &&
> > +                 !test_bit(KVM_ARCH_FLAG_VM_COUNTER_OFFSET, &vcpu->kvm->arch.flags))
> >                       timer_set_offset(vcpu_ptimer(vcpu), kvm_phys_timer_read() - val);
>
> Isn't there a general expectation that userspace not have access to the
> vCPU state of a pVM? That should be the mechanism of enforcement instead
> of special-casing these registers.

I thought I had a good reason for these checks, but I can no longer
remember what it was, and I don't see one now since, as you said, they
are not accessible.

I'll remove them.

Cheers,
/fuad

> Thanks,
> Oliver