[kvmtool PATCH 2/2] aarch64: Add support for MTE

Vladimir Murzin vladimir.murzin at arm.com
Wed Mar 23 06:57:19 PDT 2022


Hi,

On 3/23/22 12:03 PM, Alexandru Elisei wrote:
> Hi,
> 
> On Wed, Mar 23, 2022 at 10:31:15AM +0000, Vladimir Murzin wrote:
>> On 3/21/22 5:08 PM, Alexandru Elisei wrote:
>>> Hi,
>>>
>>> On Mon, Mar 21, 2022 at 03:40:18PM +0000, Vladimir Murzin wrote:
>>>> Hi Alexandru,
>>>>
>>>> On 3/21/22 3:28 PM, Alexandru Elisei wrote:
>>>>> MTE has been supported in Linux since commit 673638f434ee ("KVM: arm64:
>>>>> Expose KVM_ARM_CAP_MTE"), add support for it in kvmtool.
>>>>>
>>>>> Signed-off-by: Alexandru Elisei <alexandru.elisei at arm.com>
>>>>> ---
>>>>>    arm/aarch32/include/kvm/kvm-arch.h        |  3 +++
>>>>>    arm/aarch64/include/kvm/kvm-arch.h        |  1 +
>>>>>    arm/aarch64/include/kvm/kvm-config-arch.h |  2 ++
>>>>>    arm/aarch64/kvm.c                         | 13 +++++++++++++
>>>>>    arm/include/arm-common/kvm-config-arch.h  |  1 +
>>>>>    arm/kvm.c                                 |  3 +++
>>>>>    6 files changed, 23 insertions(+)
>>>>>
>>>>> diff --git a/arm/aarch32/include/kvm/kvm-arch.h b/arm/aarch32/include/kvm/kvm-arch.h
>>>>> index bee2fc255a82..5616b27e257e 100644
>>>>> --- a/arm/aarch32/include/kvm/kvm-arch.h
>>>>> +++ b/arm/aarch32/include/kvm/kvm-arch.h
>>>>> @@ -5,6 +5,9 @@
>>>>>    #define kvm__arch_get_kern_offset(...)	0x8000
>>>>> +struct kvm;
>>>>> +static inline void kvm__arch_enable_mte(struct kvm *kvm) {}
>>>>> +
>>>>>    #define ARM_MAX_MEMORY(...)	ARM_LOMAP_MAX_MEMORY
>>>>>    #define MAX_PAGE_SIZE	SZ_4K
>>>>> diff --git a/arm/aarch64/include/kvm/kvm-arch.h b/arm/aarch64/include/kvm/kvm-arch.h
>>>>> index 5e5ee41211ed..9124f6919d0f 100644
>>>>> --- a/arm/aarch64/include/kvm/kvm-arch.h
>>>>> +++ b/arm/aarch64/include/kvm/kvm-arch.h
>>>>> @@ -6,6 +6,7 @@
>>>>>    struct kvm;
>>>>>    unsigned long long kvm__arch_get_kern_offset(struct kvm *kvm, int fd);
>>>>>    int kvm__arch_get_ipa_limit(struct kvm *kvm);
>>>>> +void kvm__arch_enable_mte(struct kvm *kvm);
>>>>>    #define ARM_MAX_MEMORY(kvm)	({					\
>>>>>    	u64 max_ram;							\
>>>>> diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h b/arm/aarch64/include/kvm/kvm-config-arch.h
>>>>> index 04be43dfa9b2..11250365d8d5 100644
>>>>> --- a/arm/aarch64/include/kvm/kvm-config-arch.h
>>>>> +++ b/arm/aarch64/include/kvm/kvm-config-arch.h
>>>>> @@ -6,6 +6,8 @@
>>>>>    			"Run AArch32 guest"),				\
>>>>>    	OPT_BOOLEAN('\0', "pmu", &(cfg)->has_pmuv3,			\
>>>>>    			"Create PMUv3 device"),				\
>>>>> +	OPT_BOOLEAN('\0', "mte", &(cfg)->has_mte,			\
>>>>> +			"Enable memory tagging extension"),		\
>>>>>    	OPT_U64('\0', "kaslr-seed", &(cfg)->kaslr_seed,			\
>>>>>    			"Specify random seed for Kernel Address Space "	\
>>>>>    			"Layout Randomization (KASLR)"),
>>>>> diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
>>>>> index 56a0aedc263d..46548f8ee96e 100644
>>>>> --- a/arm/aarch64/kvm.c
>>>>> +++ b/arm/aarch64/kvm.c
>>>>> @@ -81,3 +81,16 @@ int kvm__get_vm_type(struct kvm *kvm)
>>>>>    	return KVM_VM_TYPE_ARM_IPA_SIZE(ipa_bits);
>>>>>    }
>>>>> +
>>>>> +void kvm__arch_enable_mte(struct kvm *kvm)
>>>>> +{
>>>>> +	struct kvm_enable_cap cap = {
>>>>> +		.cap = KVM_CAP_ARM_MTE,
>>>>> +	};
>>>>> +
>>>>> +	if (!kvm__supports_extension(kvm, KVM_CAP_ARM_MTE))
>>>>> +		die("MTE capability is not supported");
>>>>> +
>>>>> +	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &cap))
>>>>> +		die_perror("KVM_ENABLE_CAP(KVM_CAP_ARM_MTE)");
>>>>> +}
>>>>> diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h
>>>>> index 5734c46ab9e6..16e8d500a71b 100644
>>>>> --- a/arm/include/arm-common/kvm-config-arch.h
>>>>> +++ b/arm/include/arm-common/kvm-config-arch.h
>>>>> @@ -9,6 +9,7 @@ struct kvm_config_arch {
>>>>>    	bool		virtio_trans_pci;
>>>>>    	bool		aarch32_guest;
>>>>>    	bool		has_pmuv3;
>>>>> +	bool		has_mte;
>>>>>    	u64		kaslr_seed;
>>>>>    	enum irqchip_type irqchip;
>>>>>    	u64		fw_addr;
>>>>> diff --git a/arm/kvm.c b/arm/kvm.c
>>>>> index 80d233f13d0b..f2db93953778 100644
>>>>> --- a/arm/kvm.c
>>>>> +++ b/arm/kvm.c
>>>>> @@ -86,6 +86,9 @@ void kvm__arch_init(struct kvm *kvm, const char *hugetlbfs_path, u64 ram_size)
>>>>>    	/* Create the virtual GIC. */
>>>>>    	if (gic__create(kvm, kvm->cfg.arch.irqchip))
>>>>>    		die("Failed to create virtual GIC");
>>>>> +
>>>>> +	if (kvm->cfg.arch.has_mte)
>>>>> +		kvm__arch_enable_mte(kvm);
>>>>>    }
>>>>
>>>> Can we enable it unconditionally if KVM_CAP_ARM_MTE is supported, like
>>>> we do for PAC and SVE?
>>>
>>> I thought about that. The reason I chose to enable it based on a kvmtool
>>> command line option, instead of always enabling it when available, is
>>> the overhead of sanitising the MTE tags on each stage 2 data
>>> abort. Steven, am I overreacting, or is that overhead negligible?
>>>
>>> Also, as far as I know, PAC and SVE incur basically no overhead in KVM
>>> until the guest starts to use those features.
>>>
>>> Do you have a specific reason for wanting MTE to always be enabled if
>>> available? I'm happy to be convinced to make MTE enabled by default; I
>>> don't have a preference either way.
>>
>> Well, automatically enabling it if available would align with what we do
>> for other features in kvmtool and Linux itself - we tend to default to y
>> for new features, even MTE, thus improving the chances of getting reports
>> back early if something (even performance) goes wrong. Just my 2p.
> 
> According to Steven, for each 4k page the kernel uses a 128 byte buffer to
> store the tags, and then some extra memory is used to keep track of the
> buffers. Let's take the case of a VM with 1GB of memory, and be
> conservative and only account for the tag buffer. In this case, the tag
> buffers alone will be 32MB.

Right, IIUC, that memory is allocated on demand when we are about to swap
the page out or perform hibernation, so there is no reservation done
upfront - or am I missing something?

> 
> For a VM with 1GB of memory created with kvmtool built from current master
> (commit faae833a746f), pmap shows a total memory usage of 1268388K.
> Subtracting the memory of the VM, we are left with 214MB of memory consumed
> by kvmtool. Having MTE enabled would increase the memory overhead of
> kvmtool by 32/214*100 = 15%.
> 

I admit I might be missing something, but the overhead of the extra tag
storage allocated for swap/hibernation applies to any process which uses
MTE, no?

> Of course, this memory overhead scales with the amount of memory the VM
> has. The buffer size that KVM uses might change in the future (it might
> become larger or smaller), but since we cannot predict that, I'm working
> with what is implemented today.
>

Fair enough. IMO we will see more MTE capable hardware, but OTOH the memory
overhead won't magically disappear, so should we start thinking about how
to reduce it? I noticed that the code saves tags one to one, so for a guest
which doesn't actively use MTE a whole page would be tagged with zeros -
can't that case be optimized? Or maybe we could go further and compress the
tags?
  
> The kernel documentation for MTE suggests that in order to take advantage
> of it, software must be modified and recompiled. That means users who
> don't want to use MTE won't exercise it, because they won't be running
> MTE-enabled software, but they would pay the overhead regardless. This of
> course assumes that going forward software won't be using MTE by default.
> 
> kvmtool is supposed to be simple, fast and less resource intensive than
> other hypervisors, that's why I think having MTE disabled by default is the
> best way to implement the capability.

I see kvmtool as a bleeding-edge hacking tool, so the closer it is to the edge, the better :)

Cheers
Vladimir

> 
> Thanks,
> Alex




More information about the linux-arm-kernel mailing list