[PATCH v2 11/28] arm64/sve: Core task context handling

Dave Martin Dave.Martin at arm.com
Thu Sep 14 12:40:41 PDT 2017


On Wed, Sep 13, 2017 at 03:21:29PM -0700, Catalin Marinas wrote:
> On Wed, Sep 13, 2017 at 08:17:07PM +0100, Dave P Martin wrote:
> > On Wed, Sep 13, 2017 at 10:26:05AM -0700, Catalin Marinas wrote:
> > > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:
> > > > +/*
> > > > + * Trapped SVE access
> > > > + */
> > > > +void do_sve_acc(unsigned int esr, struct pt_regs *regs)
> > > > +{
> > > > +	/* Even if we chose not to use SVE, the hardware could still trap: */
> > > > +	if (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {
> > > > +		force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
> > > > +		return;
> > > > +	}
> > > > +
> > > > +	task_fpsimd_save();
> > > > +
> > > > +	sve_alloc(current);
> > > > +	fpsimd_to_sve(current);
> > > > +	if (test_and_set_thread_flag(TIF_SVE))
> > > > +		WARN_ON(1); /* SVE access shouldn't have trapped */
> > > > +
> > > > +	task_fpsimd_load();
> > > > +}
> > > 
> > > When this function is entered, do we expect TIF_SVE to always be
> > > cleared? It's worth adding a comment on the expected conditions. If
> > 
> > Yes, and this is required for correctness, as you observe.
> > 
> > I had a BUG_ON() here which I removed, but it makes sense to add a
> > comment to capture the precondition here, and how it is satisfied.
> > 
> > > that's the case, task_fpsimd_save() would only save the FPSIMD state
> > > which is fine. However, you subsequently transfer the FPSIMD state to
> > > SVE, set TIF_SVE and restore the full SVE state. If we don't care about
> > > the SVE state here, can we call task_fpsimd_load() *before* setting
> > > TIF_SVE?
> > 
> > There should be no way to reach this code with TIF_SVE set, unless
> > task_fpsimd_load() sets the CPACR trap bit wrongly, or the hardware is
> > broken -- either of which is a bug.
> 
> Thanks for confirming my assumptions. What I meant was rewriting the
> above function as:
> 
> 	/* reset the SVE state (other than FPSIMD) */
> 	task_fpsimd_save();
> 	task_fpsimd_load();

I think this works, but can you explain your rationale?

I think the main effect of your suggestion is that it is cheaper, due
to eliminating some unnecessary load/store operations.

We could go one better, and do

	mov	v0.16b, v0.16b
	mov	v1.16b, v1.16b
	// ...
	mov	v31.16b, v31.16b

which doesn't require any memory access.

But I still prefer to zero p0..p15, ffr for cleanliness, even though
the SVE programmer's model doesn't require this (unlike for the Z-reg
high bits where we do need to zero them in order not to violate the
programmer's model).

Currently sve_alloc()+task_fpsimd_load() ensures that all the non-FPSIMD
regs are zeroed too, in addition to the Z-reg high bits.

So we might want a special-purpose helper -- if so, we can do it all
with no memory access.

	pfalse	p0.b
	// ...
	pfalse	p15.b
	wrffr	p0.b

This would allow the memset-zero in sve_alloc() to be removed, but I
would need to check what other code is relying on it.
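
For illustration only, the intended effect of such a helper can be modelled
in user-space C. This is a sketch, not kernel code and not a proposed
implementation: the struct, the model_*() names and the 256-bit vector length
are all hypothetical. It relies on the architectural rule that a write to
V<n> zeroes the bits of Z<n> above bit 127, which is what makes the
self-move sequence sufficient for the Z-regs.

```c
/* User-space model only: no real SVE instructions are executed. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_VL_BYTES 32			/* assumed vector length: 256 bits */
#define MODEL_PL_BYTES (MODEL_VL_BYTES / 8)	/* one predicate bit per vector byte */

/* Hypothetical register-file model; not a kernel structure. */
struct model_sve_state {
	uint8_t z[32][MODEL_VL_BYTES];	/* Z0..Z31; V<n> is the low 16 bytes of Z<n> */
	uint8_t p[16][MODEL_PL_BYTES];	/* P0..P15 */
	uint8_t ffr[MODEL_PL_BYTES];	/* FFR */
};

/* Models any write to V<n>: the bits of Z<n> above bit 127 become zero. */
static void model_write_v(struct model_sve_state *s, unsigned int n,
			  const uint8_t val[16])
{
	assert(n < 32);
	memcpy(s->z[n], val, 16);
	memset(s->z[n] + 16, 0, MODEL_VL_BYTES - 16);
}

/* Models "mov v<n>.16b, v<n>.16b". */
static void model_mov_v_self(struct model_sve_state *s, unsigned int n)
{
	uint8_t tmp[16];

	assert(n < 32);
	memcpy(tmp, s->z[n], sizeof(tmp));	/* copy first to avoid overlapping memcpy */
	model_write_v(s, n, tmp);
}

/* Models "pfalse p<n>.b". */
static void model_pfalse(struct model_sve_state *s, unsigned int n)
{
	assert(n < 16);
	memset(s->p[n], 0, MODEL_PL_BYTES);
}

/* Models "wrffr p<n>.b". */
static void model_wrffr(struct model_sve_state *s, unsigned int n)
{
	assert(n < 16);
	memcpy(s->ffr, s->p[n], MODEL_PL_BYTES);
}

/* Hypothetical helper: reset all non-FPSIMD state with no buffer round trip. */
static void model_sve_reset_non_fpsimd(struct model_sve_state *s)
{
	unsigned int n;

	for (n = 0; n < 32; n++)
		model_mov_v_self(s, n);
	for (n = 0; n < 16; n++)
		model_pfalse(s, n);
	model_wrffr(s, 0);	/* P0 is all-false at this point, so FFR is cleared */
}
```

In this model, calling model_sve_reset_non_fpsimd() on arbitrary state leaves
the low 128 bits of each Z-reg unchanged and everything else zero, which is
the same end state described above for sve_alloc()+task_fpsimd_load().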

I guess I hadn't done this because I viewed it as an optimisation.

Thoughts?

Cheers
---Dave


