[PATCH 17/18] arm64: fpsimd: Move SME save/restore inline
Vladimir Murzin
vladimir.murzin at arm.com
Wed May 27 02:00:36 PDT 2026
Hi Mark,
On 5/26/26 17:38, Mark Rutland wrote:
> On Tue, May 26, 2026 at 04:28:17PM +0100, Mark Rutland wrote:
>> On Tue, May 26, 2026 at 03:39:56PM +0100, Vladimir Murzin wrote:
>>> Hi Mark,
>>>
>>> On 5/26/26 15:08, Mark Rutland wrote:
>>>> On Thu, May 21, 2026 at 02:25:55PM +0100, Mark Rutland wrote:
>>>>> +static inline void __sme_save_za(struct sme_state *state, unsigned long svl)
>>>>> +{
>>>>> + /* The <Wv> argument to STR (array vector) can only encode W12-W15 */
>>>>> + register unsigned long v asm ("12");
>>>> Sorry, I had meant to put "x12" here, but evidently GCC and LLVM accept
>>>> "12" on its own.
>>>>
>>>> For clarity (e.g. to match the comment) I'll change that to "w12" and
>>>> make the type unsigned int. Likewise in __sme_load_za().
>>> I suspect you are intentionally not using the "Ucj" constraint to limit the
>>> register allocator; if so, I'm wondering why?
>> Thanks for the suggestion; that was ignorance rather than intent.
>>
>> I was not aware of "Ucj" as it doesn't appear in the public GCC
>> documentation:
>>
>> https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html
>>
>> Looking at the machine description file, that's marked with '@internal',
>> so IIUC GCC folk don't seem to expect/want people to use it. That said,
>> LLVM seems to support it.
>>
>> I'll go check that all relevant toolchains support this, and poke GCC
>> folk to see if they're happy to promote that to a public constraint.
> GCC folk seem happy to make this public, which is great! I'll cross-link
> a thread here if/when patches appear.
>
> In the short term, using "Ucj" would require bumping our minimum
> supported toolchain necessary for SME:
>
> * GCC gained "Ucj" in 14.1.0, tagged on 7 May 2024.
>
> * LLVM gained "Ucj" in 18.1.0, tagged on 27 Feb 2024.
>
> ... so using that would require adding a dependency on a newer
> toolchain, e.g. via a CC_HAS_UCJ_CONSTRAINT to match the existing
> CC_HAS_K_CONSTRAINT.
>
> Aligned with the rationale on patch 8, v6.16 (tagged 27 July 2025) was
> contemporary with GCC 15.1.0 (tagged 25 April 2025) and LLVM 20.1.0
> (tagged 4 March 2025), both of which supported "Ucj".
>
>> If that's all good, I'll move over to "Ucj". If not, I'll update the
>> commit message and/or comments to explain why.
> If Will and Catalin are happy to depend on a toolchain as above, I'll go
> add the necessary CC_HAS_UCJ_CONSTRAINT Kconfig logic.
>
> Otherwise I'll go note the above in a comment, and stick with the
> register variable for now.
>
> Mark.
>
Wow, I had no intention of generating this much work for
you - thanks for digging into that! FWIW, either way works for me :)
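For anyone following along, here is a rough sketch of the two options discussed above. It is illustrative only and not taken from the patch: za_index_passthrough() and HAVE_UCJ_CONSTRAINT are hypothetical names, a plain MOV stands in for the SME STR/LDR (array vector) instructions so that it builds without SME support, and the version checks just mirror the GCC 14.1.0 / LLVM 18.1.0 thresholds quoted above. A real Kconfig check would more likely probe the compiler directly.

```c
/*
 * Sketch only: hypothetical names, not kernel code.
 * Version thresholds follow the ones quoted in this thread.
 */
#if defined(__aarch64__) && defined(__clang__)
#define HAVE_UCJ_CONSTRAINT (__clang_major__ >= 18)
#elif defined(__aarch64__) && defined(__GNUC__)
#define HAVE_UCJ_CONSTRAINT (__GNUC__ >= 14)
#else
#define HAVE_UCJ_CONSTRAINT 0
#endif

static inline unsigned int za_index_passthrough(unsigned int idx)
{
	unsigned int out;

#if HAVE_UCJ_CONSTRAINT
	/* "Ucj" lets the compiler pick any of W12-W15 for idx. */
	__asm__ volatile("mov %w0, %w1" : "=r" (out) : "Ucj" (idx));
#elif defined(__aarch64__)
	/* Fallback: pin the value to W12 with a local register variable. */
	register unsigned int v __asm__("w12") = idx;
	__asm__ volatile("mov %w0, %w1" : "=r" (out) : "r" (v));
#else
	/* Not arm64: no inline asm here, just pass the value through. */
	out = idx;
#endif
	return out;
}
```

The constraint version leaves the choice among W12-W15 to the register allocator, whereas the register variable always ties up W12; that is the only difference the sketch is meant to show.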
Cheers
Vladimir