[PATCH v2 11/28] arm64/sve: Core task context handling
Dave Martin
Dave.Martin at arm.com
Tue Oct 3 04:33:03 PDT 2017
On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:
> On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:
> > +/*
> > + * Handle SVE state across fork():
> > + *
> > + * dst and src must not end up with aliases of the same sve_state.
> > + * Because a task cannot fork except in a syscall, we can discard SVE
> > + * state for dst here, so long as we take care to retain the FPSIMD
> > + * subset of the state if SVE is in use. Reallocation of the SVE state
> > + * will be deferred until dst tries to use SVE.
> > + */
> > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)
> > +{
> > + if (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {
> > + WARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));
> > + sve_to_fpsimd(dst);
> > + }
> > +
> > + dst->thread.sve_state = NULL;
> > +}
>
> I first thought the thread flags are not visible in dst yet since
> dup_task_struct() calls arch_dup_task_struct() before
> setup_thread_stack(). However, at the end of the last year we enabled
> CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying
> on this.
>
> Anyway, IIUC we don't need sve_to_fpsimd() here. The
> arch_dup_task_struct() already called fpsimd_preserve_current_state()
> for src, so the FPSIMD state (which we care about) is transferred during
> the *dst = *src assignment. So you'd only need the last statement,
> possibly with a different function name like fpsimd_erase_sve (and maybe
> make the function static inline in the header).
Regarding the intended meanings of the thread flags, does this help?
--8<--
TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise
unchanged:
* If a task is running and !TIF_FOREIGN_FPSTATE, then the the CPU
registers of the CPU the task is running on contain the authoritative
FPSIMD/SVE state of the task. The backing memory may be stale.
* Otherwise (i.e., task not running, or task running and
TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is
authoritative. If additionally per_cpu(fpsimd_last_state,
task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then
task->fpsimd_state.cpu's registers are also up to date for task, but
not authorititive: the current FPSIMD/SVE state may be read from
them, but they must not be written.
The FPSIMD/SVE backing memory is selected by TIF_SVE:
* TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are
stored in task->thread.sve_state, formatted appropriately for vector
length task->thread.sve_vl. task->thread.sve_state must point to a
valid buffer at least sve_state_size(task) bytes in size.
* TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are
logically zero[*] but not stored anywhere; Pn, FFR are not stored and
have unspecified values from userspace's point of view.
task->thread.sve_state does not need to be non-null, valid or any
particular size: it must not be dereferenced.
In practice I don't exploit the "unspecifiedness" much. The Zn high
bits, Pn and FFR are all zeroed when setting TIF_SVE again:
sve_alloc() is the common path for this.
* FPSR and FPCR are always stored in task->fpsimd_state irrespctive of
whether TIF_SVE is clear or set, since these are not vector length
dependent.
[*] theoretically unspecified, which is what I've written in sve.txt.
However, on any FPSIMD Vn write the upper bits of the corresponding Zn
must become logically zero in order to conform to the SVE programmer's
model. It's not feasible to track which Vn have been written
individually because that would involve trapping every FPSIMD insn until
all possible Vn have been written. So the kernel ensures that the Zn
high bits become zero.
Maybe we should just guarantee "zero-or-preserved" behaviour for
userspace. This may close down some future optimisation opportunities,
but maybe it's better to be simple.
Thoughts?
[...]
Cheers
---Dave
More information about the linux-arm-kernel
mailing list