RISCV Vector unit disabled by default for new task (was Re: [PATCH v12 17/17] riscv: prctl to enable vector commands)

Vineet Gupta vineetg at rivosinc.com
Thu Dec 15 10:57:05 PST 2022



On 12/15/22 07:33, Richard Henderson wrote:
> On 12/15/22 04:28, Florian Weimer via Libc-alpha wrote:
>> * Björn Töpel:
>>
>>>> For SVE, it is in fact disabled by default in the kernel.  When a thread
>>>> executes the first SVE instruction, it will cause an exception, the kernel
>>>> will allocate memory for SVE state and enable TIF_SVE. Further use of SVE
>>>> instructions will proceed without exceptions.  Although SVE is disabled by
>>>> default, it is enabled automatically.  Since this is done automatically
>>>> during an exception handler, there is no opportunity for memory allocation
>>>> errors to be reported, as there are in the AMX case.
>>>
>>> Glibc has an SVE optimized memcpy, right? Doesn't that mean that pretty
>>> much all processes on an SVE capable system will enable SVE (lazily)? If
>>> so, that's close to "enabled by default" (unless SVE is disabled system
>>> wide).
>>
>> Yes, see sysdeps/aarch64/multiarch/memcpy.c:
>>
>>    static inline __typeof (__redirect_memcpy) *
>>    select_memcpy_ifunc (void)
>>    {
>>      INIT_ARCH ();
>>
>>      if (sve && HAVE_AARCH64_SVE_ASM)
>>        {
>>          if (IS_A64FX (midr))
>>            return __memcpy_a64fx;
>>          return __memcpy_sve;
>>        }
>>
>>      if (IS_THUNDERX (midr))
>>        return __memcpy_thunderx;
>>
>>      if (IS_THUNDERX2 (midr) || IS_THUNDERX2PA (midr))
>>        return __memcpy_thunderx2;
>>
>>      if (IS_FALKOR (midr) || IS_PHECDA (midr))
>>        return __memcpy_falkor;
>>
>>      return __memcpy_generic;
>>    }
>>
>> And the __memcpy_sve implementation actually uses SVE.
>>
>> If there were a prctl to select the vector width and enable the vector
>> extension, we'd have to pick a width in glibc anyway.
>
> There *is* a prctl to adjust the SVE vector width, but glibc does not 
> need to select because SVE dynamically adjusts to the currently 
> enabled width.  The kernel selects a default width that fits within 
> the default signal frame size.
>
> The other thing of note for SVE is that, with the default function ABI 
> all of the SVE state is call-clobbered, which allows the kernel to 
> drop instead of save state across system calls.  (There is a separate 
> vector function call ABI when SVE types are used.)

For the RISC-V psABI it is similar: all V regs are
caller-saved/call-clobbered [1], and syscalls are not required to
preserve V regs [2].
However, last I checked, the Arm ABI doc seemed to suggest that some
(parts of) the SVE regs are callee-saved [3].

>
> So while strcpy may enable SVE for the thread, the next syscall may 
> disable it again.

The next syscall could trash them, but will it disable SVE? Despite the
syscall/function-call clobbers, using V in tight loops such as mem*/str*
is still a win.
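
For reference, the SVE vector-length prctl Richard mentions can be
queried from userspace. Below is a minimal sketch, not taken from any
patch in this thread: the helper name sve_vector_length() is made up for
illustration, and the fallback #defines assume the values in
linux/prctl.h. On a kernel or CPU without SVE the prctl simply fails,
which the helper reports as -1.

```c
#include <sys/prctl.h>

/* Fallbacks for older headers; assumed to match linux/prctl.h. */
#ifndef PR_SVE_GET_VL
# define PR_SVE_GET_VL 51
#endif
#ifndef PR_SVE_VL_LEN_MASK
# define PR_SVE_VL_LEN_MASK 0xffff
#endif

/* Hypothetical helper: return the calling thread's current SVE vector
   length in bytes, or -1 if the prctl fails (for example because the
   kernel or the CPU has no SVE support).  */
int
sve_vector_length (void)
{
  int ret = prctl (PR_SVE_GET_VL);

  if (ret < 0)
    return -1;

  /* The low bits carry the length; the upper bits are flags.  */
  return ret & PR_SVE_VL_LEN_MASK;
}
```

A caller would only need to check for -1 before using the result; this
says nothing about whether the kernel currently has SVE enabled for the
thread, which is the open question above.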


[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
[2] https://github.com/riscv/riscv-v-spec/blob/master/calling-convention.adoc
[3] https://github.com/ARM-software/abi-aa/blob/2982a9f3b512a5bfdc9e3fea5d3b298f9165c36b/aapcs64/aapcs64.rst#the-base-procedure-call-standard
    Sec 6.1.3 ".... In other cases it need only preserve the low 64 bits of
    z8-z15"



