[PATCH v7 3/7] KVM: arm64: Allow userspace to configure a vCPU's virtual offset

Mon Sep 6 02:02:09 PDT 2021

On Sun, 29 Aug 2021 03:35:30 +0100,
Oliver Upton <oupton at google.com> wrote:
> 
> On Mon, Aug 16, 2021 at 12:12:13AM +0000, Oliver Upton wrote:
> > Allow userspace to access the guest's virtual counter-timer offset
> > through the ONE_REG interface. The value read or written is defined to
> > be an offset from the guest's physical counter-timer. Add some
> > documentation to clarify how a VMM should use this and the existing
> > CNTVCT_EL0.
> > 
> > Signed-off-by: Oliver Upton <oupton at google.com>
> > Reviewed-by: Andrew Jones <drjones at redhat.com>
> 
> Hrm...
> 
> I was mulling on this patch a bit more and had a thought. As previously
> discussed, the patch implements virtual offsets by broadcasting the same
> offset to all vCPUs in a guest. I wonder if this may tolerate bad
> practices from userspace that will break when KVM supports NV.
> 
> Consider that a nested guest may set CNTVOFF_EL2 to whatever value it
> wants. Presumably, we will need to patch the handling of CNTVOFF_EL2 to
> *not* broadcast to all vCPUs to save/restore NV properly. If a maligned
> VMM only wrote to a single vCPU, banking on this broadcasting
> implementation, it will fall flat on its face when handling an NV guest.
> 
> So, should we preemptively move to the new way of the world, wherein
> userspace accesses to CNTVOFF_EL2 are vCPU-local rather than VM-wide?
> 
> No strong opinions in either direction, but figured I'd address it since
> I'll need to respin this series anyway to fix ECV+nVHE.

Thought about this a bit more whilst being away from a computer...

It all boils down to what we expose as an abstraction of a machine. If
there is no EL2 in the VM, then there shouldn't be any way for the
guest to observe different values for the counters as seen from
different vcpus. That's what the architecture guarantees for a
physical system, and we shouldn't deviate from that. Opening the door
for userspace to do anything differently is a recipe for disaster.

It actually is an argument in favour of setting the various offsets to
a value that keeps the physical and virtual counters in sync,
instead of the current behaviour that allows different values to be
observed.

The above is in contrast with what the architecture allows when EL2 is
present. The hypervisor can naturally deal with the offsets as it sees
fit, and no offset should have any bearing on it (this latter point is
of course to be moderated by CNTPOFF on the host).

To sum it up, I'd rather keep the CNTVOFF behaviour as it is today
for guests that have their highest exception level at EL1. For EL2
guests, the setting will obviously have to become per-vCPU.
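
As a toy sketch of that split: a write to the offset is broadcast to
every vCPU when the guest tops out at EL1, and stays local to the
written vCPU when the guest has EL2. All names below (toy_vm, toy_vcpu,
toy_set_cntvoff, toy_read_cntvct) are hypothetical; this is neither the
KVM code nor what the series implements. The only architectural fact
relied upon is that the virtual count is the physical count minus
CNTVOFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_VCPUS 4

/* Hypothetical toy model; none of these names are KVM's real structures. */
struct toy_vcpu {
	uint64_t cntvoff;	/* offset subtracted from the physical count */
};

struct toy_vm {
	bool has_el2;		/* guest's highest exception level is EL2 */
	size_t nr_vcpus;	/* assumed to be <= TOY_MAX_VCPUS */
	struct toy_vcpu vcpus[TOY_MAX_VCPUS];
};

/* Virtual count as seen by one vCPU: physical count minus its CNTVOFF. */
static uint64_t toy_read_cntvct(const struct toy_vm *vm, size_t idx,
				uint64_t cntpct)
{
	assert(idx < vm->nr_vcpus);
	return cntpct - vm->vcpus[idx].cntvoff;
}

/* Userspace write of the virtual offset through vCPU 'idx'. */
static void toy_set_cntvoff(struct toy_vm *vm, size_t idx, uint64_t offset)
{
	assert(idx < vm->nr_vcpus);

	if (vm->has_el2) {
		/* EL2 guest: the offset is private to the vCPU written. */
		vm->vcpus[idx].cntvoff = offset;
		return;
	}

	/* EL1 guest: broadcast so every vCPU observes the same counter. */
	for (size_t i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].cntvoff = offset;
}
```

With an EL1-only toy_vm, writing an offset through vCPU 0 changes what
vCPU 1 reads as well; with has_el2 set, vCPU 1 keeps its previous
offset.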

	M.

-- 
Without deviation from the norm, progress is not possible.