[PATCH 3/3] arm64: lib: Use MOPS for usercopy routines

Kristina Martšenko kristina.martsenko at arm.com
Fri Feb 21 11:39:23 PST 2025


On 20/02/2025 19:15, Robin Murphy wrote:
> On 18/02/2025 5:14 pm, Kristina Martšenko wrote:
>> --- a/arch/arm64/lib/clear_user.S
>> +++ b/arch/arm64/lib/clear_user.S
>> @@ -17,14 +17,27 @@
>>    * Alignment fixed up by hardware.
>>    */
>> -    .p2align 4
>> -    // Alignment is for the loop, but since the prologue (including BTI)
>> -    // is also 16 bytes we can keep any padding outside the function
>>   SYM_FUNC_START(__arch_clear_user)
>>       add    x2, x0, x1
> 
> This subtlety...
> 
>> +
>> +#ifdef CONFIG_AS_HAS_MOPS
>> +    .arch_extension mops
>> +alternative_if_not ARM64_HAS_MOPS
>> +    b    .Lno_mops
>> +alternative_else_nop_endif
>> +
>> +USER(9f, setpt    [x0]!, x1!, xzr)
>> +USER(6f, setmt    [x0]!, x1!, xzr)
>> +USER(6f, setet    [x0]!, x1!, xzr)
>> +    mov    x0, #0
>> +    ret
>> +.Lno_mops:
>> +#endif
>> +
>>       subs    x1, x1, #8
>>       b.mi    2f
>> -1:
>> +
>> +1:    .p2align 4
>>   USER(9f, sttr    xzr, [x0])
>>       add    x0, x0, #8
>>       subs    x1, x1, #8
>> @@ -47,6 +60,10 @@ USER(7f, sttrb    wzr, [x2, #-1])
>>       ret
>>         // Exception fixups
>> +6:    b.cs    9f
>> +    // Registers are in Option A format
>> +    add    x0, x0, x1
>> +    b    9f
> 
> ...and then all this, is a bit hard to follow IMO. I'd be inclined to just have dedicated "mov x0, x1; ret" and "cneg x0, x1, cc; ret" fixups for the prologue and other ops, rather than entangle them with the non-MOPS flow at all. (Plus then the prologue fixup could arguably be the normal exit path as well...)

I think it's easier to follow if the logic is kept similar to the copy fixups.
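
For reference, I read that suggestion as roughly the following (an untested
sketch; the 10:/11: labels are invented here, and it assumes the prologue
fault leaves x1 holding the original count):

10:	mov	x0, x1			// setpt fault: x1 assumed to still hold the count
	ret
11:	cneg	x0, x1, cc		// setmt/setet fault: x1 is -(remaining) in
	ret				// Option A (carry clear), remaining in Option B

That is a little shorter, but it makes clear_user's fixups diverge from the
copy routines', which is what I'd like to avoid.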

> 
>>   7:    sub    x0, x2, #5    // Adjust for faulting on the final byte...
>>   8:    add    x0, x0, #4    // ...or the second word of the 4-7 byte case
>>   9:    sub    x0, x2, x0
>> diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
>> index 34e317907524..400057d607ec 100644
>> --- a/arch/arm64/lib/copy_from_user.S
>> +++ b/arch/arm64/lib/copy_from_user.S
>> @@ -52,6 +52,13 @@
>>       stp \reg1, \reg2, [\ptr], \val
>>       .endm
>> +    .macro cpy1 dst, src, count
>> +    .arch_extension mops
>> +    USER_CPY(9997f, 0, cpyfprt [\dst]!, [\src]!, \count!)
>> +    USER_CPY(9996f, 0, cpyfmrt [\dst]!, [\src]!, \count!)
>> +    USER_CPY(9996f, 0, cpyfert [\dst]!, [\src]!, \count!)
>> +    .endm
>> +
>>   end    .req    x5
>>   srcin    .req    x15
>>   SYM_FUNC_START(__arch_copy_from_user)
>> @@ -62,6 +69,9 @@ SYM_FUNC_START(__arch_copy_from_user)
>>       ret
>>         // Exception fixups
>> +9996:    b.cs    9997f
>> +    // Registers are in Option A format
>> +    add    dst, dst, count
> 
> However for copies it's somewhat justified since, IIUC, MOPS aren't guaranteed to make progress if we're starting on the last byte of a page and the next page is unmapped, and thus we may still have a "try harder" requirement similar to the previous alignment fault case, is that right?

Yes, that's right.
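
The copy instructions can fault without having copied anything even when the
first source byte is accessible (e.g. the copy starts on the last byte of a
page and the next page is unmapped), so the MOPS fixups funnel into the same
try-harder path as before. For context, that tail in copy_from_user.S
currently reads:

9997:	cmp	dst, dstin
	b.ne	9998f
	// Before being absolutely sure we couldn't copy anything, try harder
USER(9998f, ldtrb tmp1w, [srcin])
	strb	tmp1w, [dst], #1
9998:	sub	x0, end, dst			// bytes not copied
	ret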

Thanks,
Kristina

>>   9997:    cmp    dst, dstin
>>       b.ne    9998f
>>       // Before being absolutely sure we couldn't copy anything, try harder



