[PATCH 17/18] arm64: fpsimd: Move SME save/restore inline

Mark Rutland mark.rutland at arm.com
Tue May 26 09:38:40 PDT 2026


On Tue, May 26, 2026 at 04:28:17PM +0100, Mark Rutland wrote:
> On Tue, May 26, 2026 at 03:39:56PM +0100, Vladimir Murzin wrote:
> > Hi Mark,
> > 
> > On 5/26/26 15:08, Mark Rutland wrote:
> > > On Thu, May 21, 2026 at 02:25:55PM +0100, Mark Rutland wrote:
> > >> +static inline void __sme_save_za(struct sme_state *state, unsigned long svl)
> > >> +{
> > >> +	/* The <Wv> argument to STR (array vector) can only encode W12-W15 */
> > >> +	register unsigned long v asm ("12");
> > > Sorry, I had meant to put "x12" here, but evidently GCC and LLVM accept
> > > "12" on its own.
> > > 
> > > For clarity (e.g. to match the comment) I'll change that to "w12" and
> > > make the type unsigned int. Likewise in __sme_load_za().
> > 
> > I suspect you are intentionally not using the "Ucj" constraint to limit the
> > register allocator; if so, I'm wondering why?
> 
> Thanks for the suggestion; that was ignorance rather than intent.
> 
> I was not aware of "Ucj" as it doesn't appear on the public GCC
> documentation:
> 
>   https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html
> 
> Looking at the machine description file, that's marked with '@internal',
> so IIUC GCC folk don't seem to expect/want people to use it. That said,
> LLVM seems to support it.
> 
> I'll go check that all relevant toolchains support this, and poke GCC
> folk to see if they're happy to promote that to a public constraint.

GCC folk seem happy to make this public, which is great! I'll cross-link
a thread here if/when patches appear.

In the short term, using "Ucj" would require bumping the minimum
toolchain version that we require for SME:

* GCC gained "Ucj" in 14.1.0, tagged on 7 May 2024.

* LLVM gained "Ucj" in 18.1.0, tagged on 27 Feb 2024.

... so SME support would need to depend on a toolchain that accepts
"Ucj", e.g. via a CC_HAS_UCJ_CONSTRAINT Kconfig symbol to match the
existing CC_HAS_K_CONSTRAINT.

Aligned with the rationale on patch 8, v6.16 (tagged 27 July 2025) was
contemporary with GCC 15.1.0 (tagged 25 April 2025) and LLVM 20.1.0
(tagged 4 March 2025), both of which supported "Ucj".

> If that's all good, I'll move over to "Ucj". If not, I'll update the
> commit message and/or comments to explain why.

If Will and Catalin are happy to depend on a toolchain as above, I'll go
add the necessary CC_HAS_UCJ_CONSTRAINT Kconfig logic.

Otherwise I'll go note the above in a comment, and stick with the
register variable for now.
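
For illustration, the two options could be sketched roughly as below.
This is a minimal sketch rather than the code from the patch: the
function names and the HAVE_UCJ_CONSTRAINT macro are hypothetical, and a
plain ADD stands in for the SME STR/LDR (array vector) instructions so
that the example builds without SME. On non-arm64 hosts both helpers
fall back to plain C.

```c
#include <assert.h>

/*
 * HAVE_UCJ_CONSTRAINT is a hypothetical macro for this sketch; define it
 * only when the compiler accepts "Ucj" (GCC 14.1.0+ or LLVM 18.1.0+, per
 * the versions listed above).
 */

/* Option 1: pin the operand to w12 with a local register variable. */
static inline unsigned int add_one_regvar(unsigned int x)
{
#if defined(__aarch64__)
	register unsigned int v __asm__("w12") = x;

	/* ADD stands in for an instruction that can only encode W12-W15 */
	__asm__ volatile("add %w0, %w0, #1" : "+r" (v));
	return v;
#else
	return x + 1;
#endif
}

/* Option 2: let the compiler pick any of w12-w15 via "Ucj". */
static inline unsigned int add_one_ucj(unsigned int x)
{
#if defined(__aarch64__) && defined(HAVE_UCJ_CONSTRAINT)
	unsigned int v = x;

	__asm__ volatile("add %w0, %w0, #1" : "+Ucj" (v));
	return v;
#else
	/* Fallback when "Ucj" is unavailable or the target is not arm64 */
	return x + 1;
#endif
}
```

The register variable always uses w12, whereas the constraint leaves the
choice among w12-w15 to the register allocator.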

Mark.


