[PATCH v1 3/3] arm64/sve: Skip flushing Z registers with 128 bit vectors

Mark Brown broonie at kernel.org
Mon May 10 09:16:58 PDT 2021


On Mon, May 10, 2021 at 04:08:09PM +0100, Dave P Martin wrote:
> On Mon, May 10, 2021 at 01:23:48PM +0100, Mark Brown wrote:

> >  SYM_FUNC_START(sve_flush_live)
> > +	cbz		x0, 1f	// A VQ-1 of 0 is 128 bits so no extra Z state

> Should we worry about branch mispredicts here?  It may be in the noise,
> but I wonder whether it's worth considering use of alternatives here
> instead.

If people are happy adding an alternative we can definitely do that.
People seemed to want to avoid them in the past, and at this point I
don't have concrete data to show how much of a win it is, but it seems
very likely that it'll have the best overall performance - systems that
only have 128 bit vectors will never have to worry about the non-shared
bits and...

> I have a suspicion that VL = 128 bits won't be common at runtime, except
> in the case of systems where the physical (or max usable) vector length
> (i.e., sve_max_vl) is 128 bits.  

...like you I expect that systems with more than 128 bits won't tend to
configure down to 128 bits.  At the minute it's kind of finger in the
air what the practical impact actually is though, there are quite a lot
of unresolved variables.
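
To make the trade-off concrete, here is a rough user-space analogy of the
two options, not kernel code.  It assumes that the alternative would be
keyed on the maximum usable vector length, and that flushing at a VL of
128 bits is redundant but harmless.  A function pointer chosen once
stands in for boot-time patching, and all of the names below are made up
for illustration.

```c
#include <assert.h>

#define NUM_Z_REGS 32

/* Each variant returns how many Z registers it would touch. */
typedef unsigned int (*flush_fn)(unsigned int vq_minus_one);

/* Option from the patch: test VQ-1 on every call. */
static unsigned int flush_with_branch(unsigned int vq_minus_one)
{
	if (vq_minus_one == 0)	/* 128 bit vectors: no extra Z state */
		return 0;
	return NUM_Z_REGS;
}

/* Alternative-style variant for systems limited to 128 bits: never flush. */
static unsigned int flush_never(unsigned int vq_minus_one)
{
	(void)vq_minus_one;
	return 0;
}

/* Alternative-style variant for wider systems: always flush, no test. */
static unsigned int flush_always(unsigned int vq_minus_one)
{
	(void)vq_minus_one;
	return NUM_Z_REGS;
}

/*
 * Chosen once from the maximum usable vector length in bytes, standing
 * in for patching at boot.  max_vl_bytes is a hypothetical parameter.
 */
static flush_fn select_flush(unsigned int max_vl_bytes)
{
	assert(max_vl_bytes >= 16 && max_vl_bytes % 16 == 0);
	return max_vl_bytes == 16 ? flush_never : flush_always;
}
```

With this shape a 128 bit only system never tests or flushes, while a
wider system that has configured itself down to 128 bits does some
redundant work, which is the case both of us expect to be rare.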

Given the recently announced requirement for SVE in Armv9 I'd expect that
we'll actually see quite a lot of 128 bit systems in the wild for at
least some period, as with our own Neoverse N2 cores.

> > +		unsigned long vq_minus_one =
> > +			sve_vq_from_vl(current->thread.sve_vl) - 1;
> > +		sve_set_vq(vq_minus_one);
> > +		sve_flush_live(vq_minus_one);

> Seems reasonable.  sve_flush_live() could alternatively be made a C
> function, with asm wrappers for sve_flush_{z,p,ffr} so that the
> conditional logic can be inlined -- but I can't see that it would
> improve the generated code much.  So I'd be happy with it to stay in
> this form.

Yeah, I faffed a bit with options but it seemed like the effort wasn't
going to be worth it, it would mainly have inflated the size of the code
change.
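
For anyone following along, the VQ-1 arithmetic and the early exit can be
modelled in plain C roughly as below.  This is a sketch rather than the
kernel implementation: it assumes the vector length is held in bytes, so
that one quadword (VQ) is 16 bytes, and it models the flush as clearing
everything above the low 128 bits of each Z register.  struct z_reg,
vq_from_vl() and flush_z_upper() are illustrative names, not kernel
symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VQ_BYTES	16	/* one quadword is 128 bits */
#define NUM_Z_REGS	32
#define MAX_VL_BYTES	256	/* architectural maximum of 2048 bits */

/* Illustrative model of one Z register at the maximum vector length. */
struct z_reg {
	unsigned char bytes[MAX_VL_BYTES];
};

/* Convert a vector length in bytes into a count of quadwords. */
static unsigned int vq_from_vl(unsigned int vl_bytes)
{
	assert(vl_bytes >= VQ_BYTES && vl_bytes <= MAX_VL_BYTES);
	assert(vl_bytes % VQ_BYTES == 0);
	return vl_bytes / VQ_BYTES;
}

/*
 * Clear the part of each Z register above the low 128 bits, which is
 * the part shared with the FPSIMD V registers.  Returns the number of
 * bytes cleared.
 */
static size_t flush_z_upper(struct z_reg *z, unsigned int vq_minus_one)
{
	size_t upper = (size_t)vq_minus_one * VQ_BYTES;
	size_t cleared = 0;
	int i;

	if (vq_minus_one == 0)	/* 128 bit vectors: no extra Z state */
		return 0;

	for (i = 0; i < NUM_Z_REGS; i++) {
		memset(z[i].bytes + VQ_BYTES, 0, upper);
		cleared += upper;
	}
	return cleared;
}
```

A caller would pass vq_from_vl(vl) - 1, mirroring the vq_minus_one local
in the hunk quoted above, so a 16 byte VL returns immediately without
touching any register.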