[PATCH v2 0/2] MTE support for KVM guest

Thu Sep 10 06:24:51 EDT 2020

On 10/09/2020 01:33, Richard Henderson wrote:
> On 9/4/20 9:00 AM, Steven Price wrote:
>>   3. Doesn't provide any new methods for the VMM to access the tags on
>>      memory.
> ...
>> (3) may be problematic and I'd welcome input from those familiar with
>> VMMs. User space cannot access tags unless the memory is mapped with the
>> PROT_MTE flag. However enabling PROT_MTE will also enable tag checking
>> for the user space process (assuming the VMM enables tag checking for
>> the process)...
> 
> The latest version of the kernel patches for user mte support has separate
> controls for how tag check fail is reported.  Including
> 
>> +- ``PR_MTE_TCF_NONE``  - *Ignore* tag check faults
> 
> That may be less than optimal once userland starts uses tags itself, e.g.
> running qemu itself with an mte-aware malloc.
> 
> Independent of that, there's also the TCO bit, which can be toggled by any
> piece of code that wants to disable checking locally.

Yes, I would expect the TCO bit is the best option for wrapping accesses 
to make them unchecked.

> However, none of that is required for accessing tags.  User space can always
> load/store tags via LDG/STG.  That's going to be slow, though.

Yes as things stand LDG/STG is the way for user space to access tags. 
Since I don't have any real hardware I can't really comment on speed.

> It's a shame that LDGM/STGM are privileged instructions.  I don't understand
> why that was done, since there's absolutely nothing that those insns can do
> that you can't do with (up to) 16x LDG/STG.

It is a shame, however I suspect this is because to use those 
instructions you need to know the block size held in GMID_EL1. And at 
least in theory that could vary between CPUs.

> I think it might be worth adding some sort of kernel entry point that can bulk
> copy tags, e.g. page aligned quantities.  But that's just a speed of migration
> thing and could come later.

When we have some real hardware it would be worth profiling this. At the 
moment I've no idea whether the kernel entry overhead would make such an 
interface useful from a performance perspective or not.

Steve