[PATCH 5/5] KVM: arm64: uapi: Add kvm_debug_exit_arch.hsr_high

Marc Zyngier maz at kernel.org
Fri Apr 8 00:47:00 PDT 2022


Hi Alex,

On Thu, 07 Apr 2022 17:23:27 +0100,
Alexandru Elisei <alexandru.elisei at arm.com> wrote:
> 
> When userspace is debugging a VM, the kvm_debug_exit_arch part of the
> kvm_run struct contains arm64 specific debug information: the ESR_EL2
> value, encoded in the field "hsr", and the address of the instruction
> that caused the exception, encoded in the field "far".
> 
> Linux has moved to treating ESR_EL2 as a 64-bit register, but unfortunately
> kvm_debug_exit_arch.hsr cannot be changed to match because that would
> change the memory layout of the struct on big endian machines:
> 
> Current layout:               | Layout with "hsr" extended to 64 bits:
>                               |
> offset 0: ESR_EL2[31:0] (hsr) | offset 0: ESR_EL2[63:32] (hsr[63:32])
> offset 4: padding             | offset 4: ESR_EL2[31:0]  (hsr[31:0])
> offset 8: FAR_EL2[63:0] (far) | offset 8: FAR_EL2[63:0]  (far)
> 
> which breaks existing code.
> 
> The padding is inserted by the compiler because the "far" field must be
> aligned to 8 bytes (each field must be naturally aligned - aapcs64 [1],
> page 18) and the struct itself must be aligned to 8 bytes (the struct must
> be aligned to the maximum alignment of its fields - aapcs64, page 18),
> which means that "hsr" must be aligned to 8 bytes as it is the first field
> in the struct.
> 
> To avoid changing the struct size and layout for the existing fields, add a
> new field, "hsr_high", which replaces the existing padding. "hsr_high" will
> be used to hold the ESR_EL2[63:32] bits of the register. The memory layout,
> on both big and little endian machines, becomes:
> 
> Layout with "hsr_high" added:
> 
> offset 0: ESR_EL2[31:0]  (hsr)
> offset 4: ESR_EL2[63:32] (hsr_high)
> offset 8: FAR_EL2[63:0]  (far)

My concern with this change is that it isn't clear what the padding is
currently initialised to, and I don't think there is any guarantee
that it is zeroed. With that, a new userspace on an old kernel would
interpret hsr_high, and potentially observe stuff that wasn't supposed
to be interpreted.
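
To make the worry concrete, here is a rough userspace sketch. The struct
names are made up for illustration and are not the uapi definitions, and
the "old kernel" is simulated by writing only the named fields into a
pre-filled buffer. Whether a real kernel leaves the padding non-zero is
exactly what is unclear; this only models the worst case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the current layout (not the uapi header). */
struct debug_exit_old {
	uint32_t hsr;
	/* 4 bytes of compiler-inserted padding live here */
	_Alignas(8) uint64_t far;
};

/* Hypothetical mirror of the proposed layout with hsr_high added. */
struct debug_exit_new {
	uint32_t hsr;
	uint32_t hsr_high;	/* occupies the former padding */
	_Alignas(8) uint64_t far;
};

/* The new field must not move anything or change the struct size. */
_Static_assert(offsetof(struct debug_exit_old, far) == 8, "old far offset");
_Static_assert(offsetof(struct debug_exit_new, hsr_high) == 4, "hsr_high offset");
_Static_assert(offsetof(struct debug_exit_new, far) == 8, "new far offset");
_Static_assert(sizeof(struct debug_exit_old) == sizeof(struct debug_exit_new),
	       "size unchanged");

/*
 * Simulate an old kernel filling a buffer whose previous contents are
 * 'fill' bytes: only hsr and far are written, the padding is left alone.
 * Return what a new userspace would then read as hsr_high.
 */
static uint32_t stale_hsr_high(unsigned char fill)
{
	unsigned char buf[sizeof(struct debug_exit_new)];
	struct debug_exit_new view;
	uint32_t hsr = 0x56000000u;		/* arbitrary example value */
	uint64_t far = 0xffff000010001000ull;	/* arbitrary example value */

	memset(buf, fill, sizeof(buf));
	memcpy(buf + offsetof(struct debug_exit_old, hsr), &hsr, sizeof(hsr));
	memcpy(buf + offsetof(struct debug_exit_old, far), &far, sizeof(far));

	memcpy(&view, buf, sizeof(view));
	return view.hsr_high;
}
```

If the buffer happened to be zeroed beforehand, hsr_high reads as zero;
with any other prior contents, userspace sees leftover bytes and has no
way to tell them apart from real ESR_EL2 upper bits.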

That's yet another mistake in our userspace ABI (where is the time
machine when you need it?).

In order to do this, we must advertise to userspace that we provide
more information. This probably means adding a flag of some sort to
kvm_run (there are at least 128 bits of x86 stuff that can be readily
reclaimed).
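
Something along these lines is what I have in mind on the userspace
side. This is only a sketch: the flag name, its bit position and the
struct are all hypothetical, and where the flag would actually live in
kvm_run is still to be decided.

```c
#include <stdint.h>

/* Hypothetical flag bit: "the kernel has written hsr_high". */
#define SKETCH_HSR_HIGH_VALID	(UINT64_C(1) << 0)

/* Hypothetical, trimmed-down stand-in for the relevant kvm_run bits. */
struct run_sketch {
	uint64_t flags;
	uint32_t hsr;
	uint32_t hsr_high;
	uint64_t far;
};

/*
 * Build the 64-bit syndrome value. The upper half is only trusted when
 * the kernel has advertised it; otherwise it is treated as zero, which
 * matches what userspace could rely on before hsr_high existed.
 */
static uint64_t sketch_read_esr(const struct run_sketch *run)
{
	uint64_t esr = run->hsr;

	if (run->flags & SKETCH_HSR_HIGH_VALID)
		esr |= (uint64_t)run->hsr_high << 32;

	return esr;
}
```

An old kernel never sets the flag, so a new userspace ignores whatever
sits in the former padding.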

What do you think?

	M.

-- 
Without deviation from the norm, progress is not possible.


