[PATCH v2 3/3] RISC-V: add T-Head vector errata handling

Thu Jun 29 09:06:04 PDT 2023

	Hi,

Le perjantaina 23. kesäkuuta 2023, 2.13.05 EEST Heiko Stuebner a écrit :
> diff --git a/arch/riscv/include/asm/errata_list.h
> b/arch/riscv/include/asm/errata_list.h index fb1a810f3d8c..ab21fadbe9c6
> 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -154,6 +155,48 @@ asm volatile(ALTERNATIVE(				
		\
> 
>  	: "=r" (__ovl) :						
\
>  	: "memory")
> 
> +#ifdef CONFIG_ERRATA_THEAD_VECTOR
> +
> +#define THEAD_C9XX_CSR_VXSAT			0x9
> +#define THEAD_C9XX_CSR_VXRM			0xa
> +
> +/*
> + * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older
> + * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for
> + * vsetvli	t4, x0, e8, m8, d1
> + */
> +#define THEAD_VSETVLI_T4X0E8M8D1	".long	0x00307ed7\n\t"

That is equivalent to, and (IMHO) much less legible than:
".insn   i OP_V, 7, t4, x0, 3"
Or even if you don't mind second-guessing RVV 1.0 assemblers:
"vsetvli t4, zero, e8, m8, tu, mu"

Either way, you don't need to hard-code X-register operands in assembler 
macros (though you do unfortunately need to hard-code V register operands if 
you use .insn).

> +
> +/*
> + * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same
> + * encoding as the standard vse8.v and vle8.v,

Not only in theory. vse8.v and vle8.v have only one possible encoding each 
(for given operands).

> compilers seem to optimize

Nit: By "compilers", do you mean "assemblers"? That's a bit misleading to me.

> + * the call resulting in a different encoding and then using a value for
> + * the "mop" field that is not part of vector-0.7.1

Uh, no? They use mew = 0b0 and mop = 0b00, which corresponds to mop = 0b000.

> + * So encode specific variants for vstate_save and _restore.
> + */
> +#define THEAD_VSB_V_V0T0		".long	0x02028027\n\t"

That's "vse8.v v0, (t0)", at least as assembled with binutils 2.40.50.20230625 
(from Debian unstable). I don't understand the rationale for hard-coding from 
the above comment. Maybe that's just me being an idiot, but if so, then the 
comment ought to be clarified.

(I do realise that vse8.v and vsb.v are not exactly equivalent in behaviour, 
but here, the concern should be the assembler, not the processor.)

> +#define THEAD_VSB_V_V8T0		".long	0x02028427\n\t"
> +#define THEAD_VSB_V_V16T0		".long	0x02028827\n\t"
> +#define THEAD_VSB_V_V24T0		".long	0x02028c27\n\t"
> +#define THEAD_VLB_V_V0T0		".long	0x012028007\n\t"

This has one nibble too many for a 32-bit value.

And why use sign-extended loads? Zero-extended loads would have the exact same 
encoding as vle8.v, and not need this dark magic, AFAICT.

> +#define THEAD_VLB_V_V8T0		".long	0x012028407\n\t"
> +#define THEAD_VLB_V_V16T0		".long	0x012028807\n\t"
> +#define THEAD_VLB_V_V24T0		".long	0x012028c07\n\t"
> +
> +#define ALT_SR_VS_VECTOR_1_0_SHIFT	9
> +#define ALT_SR_VS_THEAD_SHIFT		23
> +
> +#define ALT_SR_VS(_val, prot)					
	\
> +asm(ALTERNATIVE("li %0, %1\t\nslli %0,%0,%3",				
\
> +		"li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID,		
\
> +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)	\
> +		: "=r"(_val)					
	\
> +		: "I"(prot##_1_0 >> ALT_SR_VS_VECTOR_1_0_SHIFT),	\
> +		  "I"(prot##_THEAD >> ALT_SR_VS_THEAD_SHIFT),		
\
> +		  "I"(ALT_SR_VS_VECTOR_1_0_SHIFT),			
\
> +		  "I"(ALT_SR_VS_THEAD_SHIFT))
> +#endif /* CONFIG_ERRATA_THEAD_VECTOR */
> +
>  #endif /* __ASSEMBLY__ */
> 
>  #endif

-- 
レミ・デニ-クールモン
http://www.remlab.net/