[PATCH v3 8/8] arm64/sve: Rework SVE trap access to use TIF_SVE_NEEDS_FLUSH

Dave Martin Dave.Martin at arm.com
Wed Jul 15 12:52:54 EDT 2020


On Mon, Jun 29, 2020 at 02:35:56PM +0100, Mark Brown wrote:
> From: Julien Grall <julien.grall at arm.com>
> 
> SVE state will be flushed on the first SVE access trap. At the moment,
> the SVE state will be generated from the FPSIMD state in software and
> then loaded in memory.
> 
> It is possible to use the newly introduce flag TIF_SVE_NEEDS_FLUSH to
> avoid a lot of memory access.
> 
> If the FPSIMD state is in memory, the SVE state will be loaded on return
> to userspace from the FPSIMD state.
> 
> If the FPSIMD state is loaded, then we need to set the vector-length
> before relying on return to userspace to flush the SVE registers. This
> is because the vector length is only set when loading from memory. We
> also need to rebind the task to the CPU so the newly allocated SVE state
> is used when the task is saved.

Reasonable overall, I think.

A few minor queries below.

> Signed-off-by: Julien Grall <julien.grall at arm.com>
> Signed-off-by: Mark Brown <broonie at kernel.org>
> ---
>  arch/arm64/include/asm/fpsimd.h  |  2 ++
>  arch/arm64/kernel/entry-fpsimd.S |  5 +++++
>  arch/arm64/kernel/fpsimd.c       | 35 ++++++++++++++++++++++----------
>  3 files changed, 31 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index bec5f14b622a..e60aa4ebb351 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -74,6 +74,8 @@ extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
>  				       unsigned long vq_minus_1);
>  extern unsigned int sve_get_vl(void);
>  
> +extern void sve_set_vq(unsigned long vq_minus_1);
> +
>  struct arm64_cpu_capabilities;
>  extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
>  
> diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
> index 5b1a9adfb00b..476c8837a7e5 100644
> --- a/arch/arm64/kernel/entry-fpsimd.S
> +++ b/arch/arm64/kernel/entry-fpsimd.S
> @@ -48,6 +48,11 @@ SYM_FUNC_START(sve_get_vl)
>  	ret
>  SYM_FUNC_END(sve_get_vl)
>  

Might be worth a comment here to remind us that x0 is the vq minus 1.

> +SYM_FUNC_START(sve_set_vq)
> +	sve_load_vq x0, x1, x2
> +	ret
> +SYM_FUNC_END(sve_set_vq)
> +
>  /*
>   * Load SVE state from FPSIMD state.
>   *
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index ccbc38b71069..dfe2e19ce591 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -944,10 +944,10 @@ void fpsimd_release_task(struct task_struct *dead_task)
>  /*
>   * Trapped SVE access
>   *
> - * Storage is allocated for the full SVE state, the current FPSIMD
> - * register contents are migrated across, and TIF_SVE is set so that
> - * the SVE access trap will be disabled the next time this task
> - * reaches ret_to_user.
> + * Storage is allocated for the full SVE state so that the code
> + * running subsequently has somewhere to save the SVE registers to. We
> + * then rely on ret_to_user to actually convert the FPSIMD registers
> + * to SVE state by flushing as required.
>   *
>   * TIF_SVE should be clear on entry: otherwise, fpsimd_restore_current_state()
>   * would have disabled the SVE access trap for userspace during
> @@ -965,14 +965,24 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs)
>  
>  	get_cpu_fpsimd_context();
>  
> -	fpsimd_save();
> -
> -	/* Force ret_to_user to reload the registers: */
> -	fpsimd_flush_task_state(current);
> +	set_thread_flag(TIF_SVE_NEEDS_FLUSH);
> +	/*
> +	 * We should not be here with SVE enabled. TIF_SVE will be set
> +	 * before returning to userspace by fpsimd_restore_current_state().
> +	 */
> +	WARN_ON(test_thread_flag(TIF_SVE));
>  
> -	fpsimd_to_sve(current);
> -	if (test_and_set_thread_flag(TIF_SVE))
> -		WARN_ON(1); /* SVE access shouldn't have trapped */
> +	/*
> +	 * When the FPSIMD state is loaded:
> +	 *	- The return path (see fpsimd_restore_current_state) requires
> +	 *	  the vector length t be loaded beforehand.

Nit: to

> +	 *	- We need to rebind the task to the CPU so the newly allocated
> +	 *	  SVE state is used when the task is saved.
> +	 */
> +	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
> +		sve_set_vq(sve_vq_from_vl(current->thread.sve_vl) - 1);
> +		fpsimd_bind_task_to_cpu();

Hmm, does this actually to the sve_user_enable(), duplicating the
sve_user_enable() in fpsimd_restore_current_state()?

> +	}
>  
>  	put_cpu_fpsimd_context();
>  }
> @@ -1189,6 +1199,9 @@ void fpsimd_restore_current_state(void)
>  		/*
>  		 * The userspace had SVE enabled on entry to the kernel
>  		 * and requires the state to be flushed.
> +		 *
> +		 * We rely on the Vector-Length to be set correctly before-hand

Trivial nit: I think we normally just write "vector length".

Could be worth saying where it gets done (i.e., do_sve_acc()).

> +		 * when converting a loaded FPSIMD state to SVE state.
>  		 */
>  		sve_flush_live();
>  		sve_user_enable();

Possibly redundant?  See do_sve_acc().

Cheers
---Dave



More information about the linux-arm-kernel mailing list