[PATCH 07/16] KVM: arm64: timers: Allow physical offset without CNTPOFF_EL2
Marc Zyngier
maz at kernel.org
Fri Feb 24 02:54:52 PST 2023
On Thu, 23 Feb 2023 22:40:25 +0000,
Colton Lewis <coltonlewis at google.com> wrote:
>
>
> Marc Zyngier <maz at kernel.org> writes:
>
> > +/* If _pred is true, set bit in _set, otherwise set it in _clr */
> > +#define assign_clear_set_bit(_pred, _bit, _clr, _set) \
> > + do { \
> > + if (_pred) \
> > + (_set) |= (_bit); \
> > + else \
> > + (_clr) |= (_bit); \
> > + } while (0)
> > +
>
> I don't think the do-while wrapper is necessary. Is there any reason
> besides style guide conformance?
It is if you want to avoid a stray ';'.
> > + /*
> > + * We have two possibility to deal with a physical offset:
> > + *
> > + * - Either we have CNTPOFF (yay!) or the offset is 0:
> > + * we let the guest freely access the HW
> > + *
> > + * - or neither of these condition apply:
> > + * we trap accesses to the HW, but still use it
> > + * after correcting the physical offset
> > + */
> > + if (!has_cntpoff() && timer_get_offset(map->direct_ptimer))
> > + tpt = tpc = true;
>
> If there are only two possibilites, then two different booleans makes
> things more complicated than it has to be.
Each boolean denotes a different architectural state. They are
separate so that someone can:
- easily understand what is going on
- affect one without affecting the other when extending this code
The "common state" is what we had before, and it was a real pig to
reverse engineer *my own code*. Yes, this is job security, but I don't
think that's a good enough reason! ;-)
So I contend that two bools make things far simpler to reason about
these things.
>
> > + assign_clear_set_bit(tpt, CNTHCTL_EL1PCEN << 10, set, clr);
> > + assign_clear_set_bit(tpc, CNTHCTL_EL1PCTEN << 10, set, clr);
>
> Might be good to name the 10 something like VHE_SHIFT so people know why
> it is applied.
VHE_SHIFT really doesn't mean more that '10' because it doesn't tell
you *why* you have to do this.
The real way of solving that one is it move everything to the sysreg
generation *and* have a way to contextualise the sysreg generation
based on features and other controls (see the discussion about
FEAT_CCIDX as an example).
>
> > +
> > +
> > + timer_set_traps(vcpu, &map);
> > }
>
> > bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu)
> > @@ -1293,27 +1363,12 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
> > }
>
> > /*
> > - * On VHE system, we only need to configure the EL2 timer trap
> > register once,
> > - * not for every world switch.
> > - * The host kernel runs at EL2 with HCR_EL2.TGE == 1,
> > - * and this makes those bits have no effect for the host kernel
> > execution.
> > + * If we have CNTPOFF, permanently set ECV to enable it.
> > */
> > void kvm_timer_init_vhe(void)
> > {
> > - /* When HCR_EL2.E2H ==1, EL1PCEN and EL1PCTEN are shifted by 10 */
> > - u32 cnthctl_shift = 10;
> > - u64 val;
> > -
> > - /*
> > - * VHE systems allow the guest direct access to the EL1 physical
> > - * timer/counter.
> > - */
> > - val = read_sysreg(cnthctl_el2);
> > - val |= (CNTHCTL_EL1PCEN << cnthctl_shift);
> > - val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
> > if (cpus_have_final_cap(ARM64_HAS_ECV_CNTPOFF))
> > - val |= CNTHCTL_ECV;
> > - write_sysreg(val, cnthctl_el2);
> > + sysreg_clear_set(cntkctl_el1, 0, CNTHCTL_ECV);
> > }
>
> What is the reason for moving these register writes from initialization
> to vcpu load time? This contradicts the comment that says this is only
> needed once and not at every world switch. Seems like doing more work
> for no reason.
You did notice that the comment got *removed*, so that there is no
contradiction?
You also understand that with a physical offset, and in the absence of
CNTPOFF, we cannot grant access to the physical counter/timer to the
guest?
Finally, given that we always have to write various bits of
CNTKCTL_EL1 for other reasons, moving this settings shouldn't result
in any extra work (specially considering that they don't require any
extra synchronisation).
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
More information about the linux-arm-kernel
mailing list