[PATCH v2 14/26] KVM: arm64: nv: Add trap forwarding infrastructure

Marc Zyngier maz at kernel.org
Sat Jul 29 02:19:17 PDT 2023


On Fri, 28 Jul 2023 19:33:31 +0100,
Oliver Upton <oliver.upton at linux.dev> wrote:
> 
> On Fri, Jul 28, 2023 at 09:29:40AM +0100, Marc Zyngier wrote:
> 
> [...]
> 
> > +/*
> > + * Bit assignment for the trap controls. We use a 64bit word with the
> > + * following layout for each trapped sysreg:
> > + *
> > + * [9:0]	enum trap_group (10 bits)
> > + * [13:10]	enum fgt_group_id (4 bits)
> > + * [19:14]	bit number in the FGT register (6 bits)
> > + * [20]		trap polarity (1 bit)
> > + * [62:21]	Unused (42 bits)
> > + * [63]		RES0 - Must be zero, as lost on insertion in the xarray
> > + */
> > +union trap_config {
> > +	u64	val;
> > +	struct {
> > +		unsigned long	cgt:10;	/* Coarse trap id */
> > +		unsigned long	fgt:4;	/* Fing Grained Trap id */
> > +		unsigned long	bit:6;	/* Bit number */
> > +		unsigned long	pol:1;	/* Polarity */
> > +		unsigned long	unk:42;	/* Unknown */
> > +		unsigned long	mbz:1;	/* Must Be Zero */
> > +	};
> > +};
> 
> Correct me if I'm wrong, but I don't think the compiler is going to
> whine if any of these bitfields are initialized with a larger value than
> can be represented... Do you think some BUILD_BUG_ON() is in order to
> ensure that trap_group fits in ::cgt?
> 
> 	BUILD_BUG_ON(__NR_TRAP_IDS__ >= BIT(10));

Indeed. This might also apply to ::fgt, and I want to add some sanity
checks to verify that the whole union isn't larger than a 'void *', as
we rely on that.

> 
> > +struct encoding_to_trap_config {
> > +	const u32			encoding;
> > +	const u32			end;
> > +	const union trap_config		tc;
> > +};
> > +
> > +#define SR_RANGE_TRAP(sr_start, sr_end, trap_id)			\
> > +	{								\
> > +		.encoding	= sr_start,				\
> > +		.end		= sr_end,				\
> > +		.tc		= {					\
> > +			.cgt		= trap_id,			\
> > +		},							\
> > +	}
> > +
> > +#define SR_TRAP(sr, trap_id)		SR_RANGE_TRAP(sr, sr, trap_id)
> > +
> > +/*
> > + * Map encoding to trap bits for exception reported with EC=0x18.
> > + * These must only be evaluated when running a nested hypervisor, but
> > + * that the current context is not a hypervisor context. When the
> > + * trapped access matches one of the trap controls, the exception is
> > + * re-injected in the nested hypervisor.
> > + */
> > +static const struct encoding_to_trap_config encoding_to_cgt[] __initconst = {
> > +};
> > +
> > +static DEFINE_XARRAY(sr_forward_xa);
> > +
> > +static union trap_config get_trap_config(u32 sysreg)
> > +{
> > +	return (union trap_config) {
> > +		.val = xa_to_value(xa_load(&sr_forward_xa, sysreg)),
> > +	};
> 
> Should we be checking for NULL here? AFAICT, the use of sentinel values
> in the trap_group enum would effectively guarantee each trap_config has
> a nonzero value.

if xa_load() returns NULL, xa_to_value() will still give us a 0, which
is an indication of a sysreg that isn't present in the trap
configuration. This can happen if we trap something that isn't yet
supported in NV, which is quite common. This allows us to use features
on the host without having to immediately write the same support for
NV guests.

But this is obviously a temporary situation. At some point, I'll
become a total bastard and demand that people treat NV as a first
class citizen. One day ;-).

> 
> > +}
> > +
> > +void __init populate_nv_trap_config(void)
> > +{
> > +	for (int i = 0; i < ARRAY_SIZE(encoding_to_cgt); i++) {
> > +		const struct encoding_to_trap_config *cgt = &encoding_to_cgt[i];
> > +		void *prev;
> > +
> > +		prev = xa_store_range(&sr_forward_xa, cgt->encoding, cgt->end,
> > +				      xa_mk_value(cgt->tc.val), GFP_KERNEL);
> > +		WARN_ON(prev);
> 
> Returning the error here and failing the overall KVM initialization
> seems to be the safest option. The WARN is still handy, though.

Yeah, this has found a number of bugs initially. Happy to fail the
initialisation after we've iterated over all the array (nothing is
more annoying than fixing a bunch of errors iteratively).

> 
> > +	}
> > +
> > +	kvm_info("nv: %ld coarse grained trap handlers\n",
> > +		 ARRAY_SIZE(encoding_to_cgt));
> > +
> > +}
> > +
> > +static enum trap_behaviour get_behaviour(struct kvm_vcpu *vcpu,
> > +					 const struct trap_bits *tb)
> > +{
> > +	enum trap_behaviour b = BEHAVE_HANDLE_LOCALLY;
> > +	u64 val;
> > +
> > +	val = __vcpu_sys_reg(vcpu, tb->index);
> > +	if ((val & tb->mask) == tb->value)
> > +		b |= tb->behaviour;
> > +
> > +	return b;
> > +}
> > +
> > +static enum trap_behaviour __do_compute_trap_behaviour(struct kvm_vcpu *vcpu,
> > +						       const enum trap_group id,
> > +						       enum trap_behaviour b)
> > +{
> > +	switch (id) {
> > +		const enum trap_group *cgids;
> > +
> > +	case __RESERVED__ ... __MULTIPLE_CONTROL_BITS__ - 1:
> > +		if (likely(id != __RESERVED__))
> > +			b |= get_behaviour(vcpu, &coarse_trap_bits[id]);
> > +		break;
> > +	case __MULTIPLE_CONTROL_BITS__ ... __COMPLEX_CONDITIONS__ - 1:
> > +		/* Yes, this is recursive. Don't do anything stupid. */
> > +		cgids = coarse_control_combo[id - __MULTIPLE_CONTROL_BITS__];
> > +		for (int i = 0; cgids[i] != __RESERVED__; i++)
> > +			b |= __do_compute_trap_behaviour(vcpu, cgids[i], b);
> 
> Would it make sense to WARN here if one of the child trap ids was in the
> recursive range?

This might be needed at some point, but we can probably tighten it for
now.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



More information about the linux-arm-kernel mailing list