[PATCH v5 1/2] arm64/sve: Don't disable SVE on syscalls return

Dave Martin Dave.Martin at arm.com
Mon Nov 16 12:59:39 EST 2020


On Fri, Nov 13, 2020 at 08:13:28PM +0000, Mark Brown wrote:
> On Fri, Nov 13, 2020 at 06:48:56PM +0000, Catalin Marinas wrote:
> > On Fri, Nov 06, 2020 at 07:35:52PM +0000, Mark Brown wrote:
> > > From: Julien Grall <julien.grall at arm.com>
> 
> > > We could instead handle flushing the SVE state in do_el0_svc() however
> > > doing this reduces the potential for further optimisations such as
> > > initializing the SVE registers directly from the FPSIMD state when
> > > taking a SVE access trap and has some potential edge cases if we
> > > schedule before we return to userspace after do_el0_svc().
> 
> > Ah, you covered the do_el0_svc() topic here already. Is this potential
> > optimisation prevented because TIF_SVE has two meanings: SVE access
> > enabled and sve_state valid (trying to page this code in)? For example,
> > task_fpsimd_load() uses TIF_SVE to check whether to load from sve_state
> > or from fpsimd. fpsimd_bind_task_to_cpu() uses TIF_SVE to enable the
> > user access.
> 
> Yes, there's some overloading with the storage for the SVE register file
> and the new flag is effectively about the storage.

Right.

First and foremost, TIF_SVE means "userspace is allowed to access the
SVE regs".

To guarantee that we have somewhere to save the regs at short notice, I
also made sure that the SVE regs backing storage (sve_state) is
allocated before setting this flag.  This allows critical path code to
assume the storage is valid whenever a need to save the SVE regs arises.



So, TIF_FOREIGN_FPSTATE and TIF_SVE are independent, and mean:

 * TIF_SVE true: sve_state allocated (and large enough); userspace is
allowed to access the SVE regs directly.

 * TIF_SVE false: sve_state may not be allocated; userspace is not
allowed to access the SVE regs directly; their logical contents are all
0 (except for vl, which is persistent independent of all these flags).

 * TIF_FOREIGN_FPSTATE false (and task running): the hardware regs are
up to date for current; task may be in user or kernelspace.

 * TIF_FOREIGN_FPSTATE true (and task running): the hardware regs may
not be up to date for current; task definitely in kernelspace.

 * task scheduled out: similar to TIF_FOREIGN_FPSTATE true: the hardware
regs may not be up to date for task; task in kernelspace FWIW.  For the
most part, we can treat this the same as "TIF_FOREIGN_FPSTATE true".
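
As a rough illustration only (a userspace model, not kernel code), the
flag meanings listed above can be written down as a handful of
predicates.  Everything named here (struct task_model and the helper
functions) is hypothetical and exists only for this sketch; the real
flags are thread-info bits and the real checks live in the arm64 fpsimd
code.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_model {
	bool tif_sve;             /* models TIF_SVE */
	bool tif_foreign_fpstate; /* models TIF_FOREIGN_FPSTATE */
	bool running;             /* false: task is scheduled out */
	void *sve_state;          /* models the SVE backing storage */
	unsigned int vl;          /* vector length; persists whatever the flags say */
};

/* TIF_SVE set implies that the backing storage has been allocated. */
static bool sve_storage_invariant_holds(const struct task_model *t)
{
	return !t->tif_sve || t->sve_state != NULL;
}

/* Userspace may touch the SVE regs directly only while TIF_SVE is set. */
static bool user_may_access_sve(const struct task_model *t)
{
	assert(sve_storage_invariant_holds(t));
	return t->tif_sve;
}

/*
 * The hardware regs are known to be current only for a running task
 * with TIF_FOREIGN_FPSTATE clear.  A scheduled-out task is treated the
 * same as "TIF_FOREIGN_FPSTATE true".
 */
static bool hw_regs_up_to_date(const struct task_model *t)
{
	return t->running && !t->tif_foreign_fpstate;
}

/* Only a task whose hardware regs are current may be in userspace. */
static bool may_be_in_userspace(const struct task_model *t)
{
	return hw_regs_up_to_date(t);
}
```

In this model the two flags are checked independently of each other:
user_may_access_sve() never looks at tif_foreign_fpstate, and
hw_regs_up_to_date() never looks at tif_sve.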


So far as I can see, TIF_SVE_NEEDS_FLUSH is a special state in which SVE
is half-enabled and the SVE regs half-loaded, so it's not an orthogonal
flag, but a distinct state that sits between the others.

In this state the regs are neither fully saved nor fully loaded, and we
haven't committed to either enabling or disabling SVE for userspace.
It's an intermediate state which we can choose our path out of based on
policy or performance concerns.

The new state's closest relative is probably TIF_SVE &&
!TIF_FOREIGN_FPSTATE.  As in that state, sve_state is valid, and
ZCR_EL1.LEN is loaded.  But we do have a bit of work to do in order to
transition to most of the other states.
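
One way to picture "a distinct state rather than an orthogonal flag" is
as a three-valued mode instead of an extra independent bit.  The sketch
below is a userspace model under that assumption, not what the patch
does: the enum, the function names and the keep_sve_enabled policy
parameter are all hypothetical, and the actual register flushing and
ZCR_EL1 handling are not modelled.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <stdbool.h>

enum sve_mode_model {
	SVE_MODEL_DISABLED,    /* corresponds to TIF_SVE false */
	SVE_MODEL_ENABLED,     /* corresponds to TIF_SVE true */
	SVE_MODEL_NEEDS_FLUSH, /* intermediate: not yet committed either way */
};

/*
 * Leave the intermediate state.  The choice is a policy decision
 * (assumed here to be passed in by the caller); the two settled
 * states are returned unchanged.
 */
enum sve_mode_model sve_model_resolve(enum sve_mode_model mode,
				      bool keep_sve_enabled)
{
	if (mode != SVE_MODEL_NEEDS_FLUSH)
		return mode;
	return keep_sve_enabled ? SVE_MODEL_ENABLED : SVE_MODEL_DISABLED;
}

/*
 * sve_state is guaranteed valid in the enabled state and, like its
 * closest relative, in the intermediate state.  In the disabled state
 * it may or may not be allocated, so nothing is guaranteed.
 */
bool sve_model_storage_guaranteed(enum sve_mode_model mode)
{
	return mode != SVE_MODEL_DISABLED;
}
```

Modelling it this way makes the point that the intermediate state cannot
be combined freely with the others: a task is in exactly one of the
three, and leaving the intermediate one requires an explicit decision.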

I will try to come up with a state diagram, but not today...


I'll look at the code and these other comments tomorrow.

[...]

Cheers
---Dave


