[PATCH] ARM: vfp: Add vudot opcode to VFP undef hook

Xuewen Yan xuewen.yan94 at gmail.com
Thu Sep 21 05:05:22 PDT 2023


Hi Mark-PK, Robin

We also meets the scene, and has the following stack:

Thread-2 (5361): undefined instruction: pc=d05ae08c
CPU: 5 PID: 5361 Comm: Thread-2 Tainted: G        W  O
5.4.210-android12-9-04458-g56c7c43d3298-ab000045 #98
Hardware name: Generic DT based system
PC is at 0x7d1aa068
LR is at 0x7c22ae50
pc : [<7d1aa068>]    lr : [<7c22ae50>]    psr: 800b0010
sp : 7c78ee20  ip : 7c22ae40  fp : 7c78eea0
r10: 7c22ae60  r9 : 7c22ae70  r8 : 00000008
r7 : 7c0fee80  r6 : 7c0fee70  r5 : 7c0fee60  r4 : 00000010
r3 : 7c0fee50  r2 : 00000000  r1 : 00000010  r0 : a698aee0
Flags: Nzcv  IRQs on  FIQs on  Mode USER_32  ISA ARM  Segment user
Control: 10c5383d  Table: 8649c06a  DAC: 00000055
Code: edddcb1e fe22edd4 f4600a0d fc67cd98 (fe20edf4) <<<

So, we also need add the 0xfe000000:

Could you please help add the following patch into the patch-v2?

Thanks!

---
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 7e8773a2d99d..1078c0f169d2 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -788,6 +788,18 @@ static struct undef_hook neon_support_hook[] = {{
        .cpsr_mask      = PSR_T_BIT,
        .cpsr_val       = 0,
        .fn             = vfp_support_entry,
+}, {
+       .instr_mask     = 0xfc000000,
+       .instr_val      = 0xfc000000,
+       .cpsr_mask      = PSR_T_BIT,
+       .cpsr_val       = 0,
+       .fn             = vfp_support_entry,
+}, {
+       .instr_mask     = 0xfe000000,
+       .instr_val      = 0xfe000000,
+       .cpsr_mask      = PSR_T_BIT,
+       .cpsr_val       = 0,
+       .fn             = vfp_support_entry,
 }, {
        .instr_mask     = 0xef000000,
        .instr_val      = 0xef000000,
@@ -800,6 +812,18 @@ static struct undef_hook neon_support_hook[] = {{
        .cpsr_mask      = PSR_T_BIT,
        .cpsr_val       = PSR_T_BIT,
        .fn             = vfp_support_entry,
+}, {
+       .instr_mask     = 0xfc000000,
+       .instr_val      = 0xfc000000,
+       .cpsr_mask      = PSR_T_BIT,
+       .cpsr_val       = PSR_T_BIT,
+       .fn             = vfp_support_entry,
+}, {
+       .instr_mask     = 0xfe000000,
+       .instr_val      = 0xfe000000,
+       .cpsr_mask      = PSR_T_BIT,
+       .cpsr_val       = PSR_T_BIT,
+       .fn             = vfp_support_entry,
 }};

 static struct undef_hook vfp_support_hook = {

On Thu, Sep 21, 2023 at 12:40 PM Mark-PK Tsai <mark-pk.tsai at mediatek.com> wrote:
>
> > On 2023-09-20 09:39, Mark-PK Tsai wrote:
> > > Add vudot opcode to the VFP undef hook to fix the
> > > potentially undefined instruction error when the
> > > user space executes vudot instruction.
> >
> > Did the kernel expose a hwcap to say that the dot product extension is
> > supported? I'm pretty sure it didn't, so why would userspace expect this
> > to work? ;)
>
> The hwcap for dotprod has been exported since commit:
>
> 62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd
>
> >
> > IIRC Amit was looking at defining the hwcaps to align with arm64 compat,
> > but I believe that series faltered since most of them weren't actually
> > needed (and I think at that point it was still missing the VFP support
> > code parts). It would be nice if someone could pick up and combine both
>
> Were the mentioned series related to this commit?
>
> 62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd
>
> > efforts and get this done properly; fill in *all* the hwcaps and
> > relevant handling for extensions which Cortex-A55 supports (since
> > there's definitely more than just VUDOT), and then hopefully we're done
> > for good.
>
> Agree.
>
> >
> > > Before this commit, kernel didn't handle the undef exception
> > > caused by vudot and didn't enable VFP in lazy VFP context
> > > switch code like other NEON instructions.
> > > This led to the occurrence of the undefined instruction
> > > error as following:
> > >
> > > [  250.741238 ] 0904 (26902): undefined instruction: pc=004014ec
> > > ...
> > > [  250.741287 ] PC is at 0x4014ec
> > > [  250.741298 ] LR is at 0xb677874f
> > > [  250.741303 ] pc : [<004014ec>]    lr : [<b677874f>]    psr: 80070010
> > > [  250.741309 ] sp : beffedb0  ip : b67d7864  fp : beffee58
> > > [  250.741314 ] r10: 00000000  r9 : 00000000  r8 : 00000000
> > > [  250.741319 ] r7 : 00000001  r6 : 00000001  r5 : beffee90  r4 : 00401470
> > > [  250.741324 ] r3 : beffee20  r2 : beffee30  r1 : beffee40  r0 : 004003a8
> > > [  250.741331 ] Flags: Nzcv  IRQs on  FIQs on  Mode USER_32  ISA ARM Segment user
> > > [  250.741339 ] Control: 10c5383d  Table: 32d0406a  DAC: 00000055
> > > [  250.741348 ] Code: f4434aef f4610aef f4622aef f4634aef (fc620df4)
> > >
> > > Below is the assembly of the user program:
> > >
> > > 0x4014dc <+108>: vst1.64 {d20, d21}, [r3:128]
> > > 0x4014e0 <+112>: vld1.64 {d16, d17}, [r1:128]
> > > 0x4014e4 <+116>: vld1.64 {d18, d19}, [r2:128]
> > > 0x4014e8 <+120>: vld1.64 {d20, d21}, [r3:128] --> switch out
> > > 0x4014ec <+124>: vudot.u8 q8, q9, q10         <-- switch in, and FPEXC.EN = 0
> > >                                                    SIGILL(illegal instruction)
> > >
> > > Link: https://services.arm.com/support/s/case/5004L00000XsOjP
> >
> > Linking to your private support case is not useful to upstream. Even I
> > can't open that link.
>
> I thought that maybe someone in arm need this.
> But it seems a bit noisy so I will remove the link from v2.
>
> >
> > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai at mediatek.com>
> > > ---
> > >   arch/arm/vfp/vfpmodule.c | 12 ++++++++++++
> > >   1 file changed, 12 insertions(+)
> > >
> > > diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> > > index 7e8773a2d99d..7eab8d1019d2 100644
> > > --- a/arch/arm/vfp/vfpmodule.c
> > > +++ b/arch/arm/vfp/vfpmodule.c
> > > @@ -788,6 +788,12 @@ static struct undef_hook neon_support_hook[] = {{
> > >     .cpsr_mask      = PSR_T_BIT,
> > >     .cpsr_val       = 0,
> > >     .fn             = vfp_support_entry,
> > > +}, {
> > > +   .instr_mask     = 0xffb00000,
> > > +   .instr_val      = 0xfc200000,
> > > +   .cpsr_mask      = PSR_T_BIT,
> > > +   .cpsr_val       = 0,
> > > +   .fn             = vfp_support_entry,
> > >   }, {
> > >     .instr_mask     = 0xef000000,
> > >     .instr_val      = 0xef000000,
> > > @@ -800,6 +806,12 @@ static struct undef_hook neon_support_hook[] = {{
> > >     .cpsr_mask      = PSR_T_BIT,
> > >     .cpsr_val       = PSR_T_BIT,
> > >     .fn             = vfp_support_entry,
> > > +}, {
> > > +   .instr_mask     = 0xffb00000,
> > > +   .instr_val      = 0xfc200000,
> > > +   .cpsr_mask      = PSR_T_BIT,
> > > +   .cpsr_val       = PSR_T_BIT,
> > > +   .fn             = vfp_support_entry,
> >
> > Why have two entries conditional on each possible value of one bit for
> > otherwise identical encodings? Surely it suffices to set both cpsr_mask
> > and cpsr_val to 0?
>
> You're right.
> I will set both cpsr_mask and cpsr_val to 0 and use single entry,
> as you suggested, in the v2 patch.
>
> Thanks.
>
> >
> > Thanks,
> > Robin.
> >
> > >   }};
> > >
> > >   static struct undef_hook vfp_support_hook = {



More information about the linux-arm-kernel mailing list