[PATCH v3 4/4] arm64: mte: Optimize mte_assign_mem_tag_range()

Vincenzo Frascino vincenzo.frascino at arm.com
Sat Jan 16 09:22:08 EST 2021


Hi Mark,

On 1/15/21 3:45 PM, Mark Rutland wrote:
> On Fri, Jan 15, 2021 at 12:00:43PM +0000, Vincenzo Frascino wrote:
>> mte_assign_mem_tag_range() is called on production KASAN HW hot
>> paths. It makes sense to optimize it in an attempt to reduce the
>> overhead.
>>
>> Optimize mte_assign_mem_tag_range() based on the indications provided at
>> [1].
> 
> ... what exactly is the optimization?
> 
> I /think/ you're just trying to have it inlined, but you should mention
> that explicitly.
> 

Good point, I will change it in the next version. I used "Optimize" as a
continuation of the topic in the previous thread but you are right it is not
immediately obvious.

>>
>> [1] https://lore.kernel.org/r/CAAeHK+wCO+J7D1_T89DG+jJrPLk3X9RsGFKxJGd0ZcUFjQT-9Q@mail.gmail.com/
>>
>> Cc: Catalin Marinas <catalin.marinas at arm.com>
>> Cc: Will Deacon <will at kernel.org>
>> Signed-off-by: Vincenzo Frascino <vincenzo.frascino at arm.com>
>> ---
>>  arch/arm64/include/asm/mte.h | 26 +++++++++++++++++++++++++-
>>  arch/arm64/lib/mte.S         | 15 ---------------
>>  2 files changed, 25 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
>> index 1a715963d909..9730f2b07b79 100644
>> --- a/arch/arm64/include/asm/mte.h
>> +++ b/arch/arm64/include/asm/mte.h
>> @@ -49,7 +49,31 @@ long get_mte_ctrl(struct task_struct *task);
>>  int mte_ptrace_copy_tags(struct task_struct *child, long request,
>>  			 unsigned long addr, unsigned long data);
>>  
>> -void mte_assign_mem_tag_range(void *addr, size_t size);
>> +static inline void mte_assign_mem_tag_range(void *addr, size_t size)
>> +{
>> +	u64 _addr = (u64)addr;
>> +	u64 _end = _addr + size;
>> +
>> +	/*
>> +	 * This function must be invoked from an MTE enabled context.
>> +	 *
>> +	 * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
>> +	 * size must be non-zero and MTE_GRANULE_SIZE aligned.
>> +	 */
>> +	do {
>> +		/*
>> +		 * 'asm volatile' is required to prevent the compiler to move
>> +		 * the statement outside of the loop.
>> +		 */
>> +		asm volatile(__MTE_PREAMBLE "stg %0, [%0]"
>> +			     :
>> +			     : "r" (_addr)
>> +			     : "memory");
>> +
>> +		_addr += MTE_GRANULE_SIZE;
>> +	} while (_addr < _end);
> 
> Is there any chance that this can be used for the last bytes of the
> virtual address space? This might need to change to `_addr == _end` if
> that is possible, otherwise it'll terminate early in that case.
> 

Theoretically it is a possibility. I will change the condition and add a note
for that.

>> +}
> 
> What does the code generation look like for this, relative to the
> assembly version?
> 

The assembly looks like this:

 390:   8b000022        add     x2, x1, x0
 394:   aa0003e1        mov     x1, x0
 398:   d9200821        stg     x1, [x1]
 39c:   91004021        add     x1, x1, #0x10
 3a0:   eb01005f        cmp     x2, x1
 3a4:   54ffffa8        b.hi    398 <mte_set_mem_tag_range+0x48>

You can see the handcrafted one below.

> Thanks,
> Mark.
> 
>> +
>>  
>>  #else /* CONFIG_ARM64_MTE */
>>  
>> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
>> index 9e1a12e10053..a0a650451510 100644
>> --- a/arch/arm64/lib/mte.S
>> +++ b/arch/arm64/lib/mte.S
>> @@ -150,18 +150,3 @@ SYM_FUNC_START(mte_restore_page_tags)
>>  	ret
>>  SYM_FUNC_END(mte_restore_page_tags)
>>  
>> -/*
>> - * Assign allocation tags for a region of memory based on the pointer tag
>> - *   x0 - source pointer
>> - *   x1 - size
>> - *
>> - * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
>> - * size must be non-zero and MTE_GRANULE_SIZE aligned.
>> - */
>> -SYM_FUNC_START(mte_assign_mem_tag_range)
>> -1:	stg	x0, [x0]
>> -	add	x0, x0, #MTE_GRANULE_SIZE
>> -	subs	x1, x1, #MTE_GRANULE_SIZE
>> -	b.gt	1b
>> -	ret
>> -SYM_FUNC_END(mte_assign_mem_tag_range)
>> -- 
>> 2.30.0
>>

-- 
Regards,
Vincenzo



More information about the linux-arm-kernel mailing list