[PATCH] ARM: vfp: Add vudot opcode to VFP undef hook

Mark-PK Tsai mark-pk.tsai at mediatek.com
Wed Sep 20 19:13:50 PDT 2023


> On 2023-09-20 09:39, Mark-PK Tsai wrote:
> > Add vudot opcode to the VFP undef hook to fix the
> > potentially undefined instruction error when the
> > user space executes vudot instruction.
> 
> Did the kernel expose a hwcap to say that the dot product extension is 
> supported? I'm pretty sure it didn't, so why would userspace expect this 
> to work? ;)

The hwcap for dotprod has been exported since commit:

62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd

> 
> IIRC Amit was looking at defining the hwcaps to align with arm64 compat, 
> but I believe that series faltered since most of them weren't actually 
> needed (and I think at that point it was still missing the VFP support 
> code parts). It would be nice if someone could pick up and combine both 

Were the mentioned series related to this commit?

62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd

> efforts and get this done properly; fill in *all* the hwcaps and 
> relevant handling for extensions which Cortex-A55 supports (since 
> there's definitely more than just VUDOT), and then hopefully we're done 
> for good.

Agree.

> 
> > Before this commit, kernel didn't handle the undef exception
> > caused by vudot and didn't enable VFP in lazy VFP context
> > switch code like other NEON instructions.
> > This led to the occurrence of the undefined instruction
> > error as following:
> > 
> > [  250.741238 ] 0904 (26902): undefined instruction: pc=004014ec
> > ...
> > [  250.741287 ] PC is at 0x4014ec
> > [  250.741298 ] LR is at 0xb677874f
> > [  250.741303 ] pc : [<004014ec>]    lr : [<b677874f>]    psr: 80070010
> > [  250.741309 ] sp : beffedb0  ip : b67d7864  fp : beffee58
> > [  250.741314 ] r10: 00000000  r9 : 00000000  r8 : 00000000
> > [  250.741319 ] r7 : 00000001  r6 : 00000001  r5 : beffee90  r4 : 00401470
> > [  250.741324 ] r3 : beffee20  r2 : beffee30  r1 : beffee40  r0 : 004003a8
> > [  250.741331 ] Flags: Nzcv  IRQs on  FIQs on  Mode USER_32  ISA ARM Segment user
> > [  250.741339 ] Control: 10c5383d  Table: 32d0406a  DAC: 00000055
> > [  250.741348 ] Code: f4434aef f4610aef f4622aef f4634aef (fc620df4)
> > 
> > Below is the assembly of the user program:
> > 
> > 0x4014dc <+108>: vst1.64 {d20, d21}, [r3:128]
> > 0x4014e0 <+112>: vld1.64 {d16, d17}, [r1:128]
> > 0x4014e4 <+116>: vld1.64 {d18, d19}, [r2:128]
> > 0x4014e8 <+120>: vld1.64 {d20, d21}, [r3:128] --> switch out
> > 0x4014ec <+124>: vudot.u8 q8, q9, q10         <-- switch in, and FPEXC.EN = 0
> >                                                    SIGILL(illegal instruction)
> > 
> > Link: https://services.arm.com/support/s/case/5004L00000XsOjP
> 
> Linking to your private support case is not useful to upstream. Even I 
> can't open that link.

I thought that maybe someone in arm need this.
But it seems a bit noisy so I will remove the link from v2.

> 
> > Signed-off-by: Mark-PK Tsai <mark-pk.tsai at mediatek.com>
> > ---
> >   arch/arm/vfp/vfpmodule.c | 12 ++++++++++++
> >   1 file changed, 12 insertions(+)
> > 
> > diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> > index 7e8773a2d99d..7eab8d1019d2 100644
> > --- a/arch/arm/vfp/vfpmodule.c
> > +++ b/arch/arm/vfp/vfpmodule.c
> > @@ -788,6 +788,12 @@ static struct undef_hook neon_support_hook[] = {{
> >   	.cpsr_mask	= PSR_T_BIT,
> >   	.cpsr_val	= 0,
> >   	.fn		= vfp_support_entry,
> > +}, {
> > +	.instr_mask	= 0xffb00000,
> > +	.instr_val	= 0xfc200000,
> > +	.cpsr_mask	= PSR_T_BIT,
> > +	.cpsr_val	= 0,
> > +	.fn		= vfp_support_entry,
> >   }, {
> >   	.instr_mask	= 0xef000000,
> >   	.instr_val	= 0xef000000,
> > @@ -800,6 +806,12 @@ static struct undef_hook neon_support_hook[] = {{
> >   	.cpsr_mask	= PSR_T_BIT,
> >   	.cpsr_val	= PSR_T_BIT,
> >   	.fn		= vfp_support_entry,
> > +}, {
> > +	.instr_mask	= 0xffb00000,
> > +	.instr_val	= 0xfc200000,
> > +	.cpsr_mask	= PSR_T_BIT,
> > +	.cpsr_val	= PSR_T_BIT,
> > +	.fn		= vfp_support_entry,
> 
> Why have two entries conditional on each possible value of one bit for 
> otherwise identical encodings? Surely it suffices to set both cpsr_mask 
> and cpsr_val to 0?

You're right.
I will set both cpsr_mask and cpsr_val to 0 and use single entry,
as you suggested, in the v2 patch.

Thanks.

> 
> Thanks,
> Robin.
> 
> >   }};
> >   
> >   static struct undef_hook vfp_support_hook = {



More information about the linux-arm-kernel mailing list