[RFC PATCH] arm64/sve: ABI change: Zero SVE regs on syscall entry

Catalin Marinas catalin.marinas at arm.com
Mon Oct 23 10:43:23 PDT 2017


Hi Dave,

On Mon, Oct 23, 2017 at 12:37:44PM +0100, Dave P Martin wrote:
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 6718780..68917de 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -850,6 +850,25 @@ el0_svc:
>  	adrp	stbl, sys_call_table		// load syscall table pointer
>  	mov	wscno, w8			// syscall number in w8
>  	mov	wsc_nr, #__NR_syscalls
> +
> +#ifdef CONFIG_ARM64_SVE
> +alternative_if_not ARM64_SVE
> +	b	1f
> +alternative_else
> +	ldr	x10, [tsk, #TSK_TI_FLAGS]
> +alternative_endif
> +	bic	x10, x10, #_TIF_SVE		// discard SVE state on syscall

Could we have:

	tbz	x10, #TIF_SVE, 1f

> +	str	x10, [tsk, #TSK_TI_FLAGS]
> +
> +	tbz	x10, #TIF_FOREIGN_FPSTATE, 2f
> +	ASM_BUG()
> +2:

If we've never hit this problem before, I think we can avoid the
TIF_FOREIGN_FPSTATE check here. Just double-check that we always invoke
do_notify_resume() on the return path.

> +	mrs	x9, cpacr_el1
> +	bic	x9, x9, #CPACR_EL1_ZEN_EL0EN	// disable SVE for el0
> +	msr	cpacr_el1, x9			// synchronised by eret to el0

I haven't checked the other patches but do we context switch cpacr_el1?
If not, we may return to user with this enabled. Alternatively, can we
disable it on the return path?

> +1:
> +#endif
> +
>  el0_svc_naked:					// compat entry point
>  	stp	x0, xscno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
>  	enable_dbg_and_irq

I think you could move the TIF_SVE checks above into el0_svc_naked and
reuse x16 to avoid loading TSK_TI_FLAGS twice. For compat tasks TIF_SVE
would be 0 anyway (as long as we can skip the whole block as per my
suggested tbz).

> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index ac56a96..9a3c561 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -247,22 +247,22 @@ void arch_release_task_struct(struct task_struct *tsk)
>  	fpsimd_release_task(tsk);
>  }
>  
> +/*
> + * src and dst may temporarily have aliased sve_state after task_struct
> + * is copied.  We cannot fix this properly here, because src may have
> + * live SVE state and dst's thread_info may not exist yet, so tweaking
> + * either src's or dst's TIF_SVE is not safe.
> + *
> + * The unaliasing is done in copy_thread() instead.  This works because
> + * dst is not schedulable or traceable until both of these functions
> + * have been called.
> + */
>  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
>  {
> -	/*
> -	 * For SVE, dst and src must not end up with aliases of the same
> -	 * sve_state.  Because we are definitely in a syscall here, SVE state
> -	 * can be discarded unconditionally without violating the user ABI.
> -	 * dst's sve_state pointer can then be zapped with no ill effects.
> -	 *
> -	 * src can safely retain its sve_state memory for later use.
> -	 */
>  	if (current->mm)
> -		fpsimd_preserve_current_state_discard_sve();
> +		fpsimd_preserve_current_state();
>  	*dst = *src;
>  
> -	dst->thread.sve_state = NULL;
> -
>  	return 0;
>  }
>  
> @@ -275,6 +275,13 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
>  
>  	memset(&p->thread.cpu_context, 0, sizeof(struct cpu_context));
>  
> +	/*
> +	 * Unalias p->thread.sve_state (if any) from the parent task
> +	 * and discard SVE state for p:
> +	 */
> +	clear_tsk_thread_flag(p, TIF_SVE);
> +	p->thread.sve_state = NULL;
> +
>  	if (likely(!(p->flags & PF_KTHREAD))) {
>  		*childregs = *current_pt_regs();
>  		childregs->regs[0] = 0;

This looks fine.

-- 
Catalin



More information about the linux-arm-kernel mailing list