[PATCH] ARM: vfp: Use asm volatile in fmrx/fmxr macros

Calvin Owens calvin at wbinvd.org
Mon Jul 29 09:07:30 PDT 2024


On Monday 07/29 at 09:12 +0200, Ard Biesheuvel wrote:
> On Sun, 28 Jul 2024 at 23:29, Calvin Owens <calvin at wbinvd.org> wrote:
> >
> > On Sunday 07/28 at 19:09 +0200, Ard Biesheuvel wrote:
> > > (cc Arnd, Nathan, zhuqiuer)
> > >
> > > On Sun, 28 Jul 2024 at 10:21, Calvin Owens <calvin at wbinvd.org> wrote:
> > > > <snip>
> > > >
> > > > Crudely grepping for vmsr/vmrs instructions in the otherwise nearly
> > > > idential text for vfp_support_entry() makes the problem obvious:
> > > >
> > > >     vmlinux.llvm.good [0xc0101cb8] <+48>:  vmrs   r7, fpexc
> > > >     vmlinux.llvm.good [0xc0101cd8] <+80>:  vmsr   fpexc, r0
> > > >     vmlinux.llvm.good [0xc0101d20] <+152>: vmsr   fpexc, r7
> > > >     vmlinux.llvm.good [0xc0101d38] <+176>: vmrs   r4, fpexc
> > > >     vmlinux.llvm.good [0xc0101d6c] <+228>: vmrs   r0, fpscr
> > > >     vmlinux.llvm.good [0xc0101dc4] <+316>: vmsr   fpexc, r0
> > > >     vmlinux.llvm.good [0xc0101dc8] <+320>: vmrs   r0, fpsid
> > > >     vmlinux.llvm.good [0xc0101dcc] <+324>: vmrs   r6, fpscr
> > > >     vmlinux.llvm.good [0xc0101e10] <+392>: vmrs   r10, fpinst
> > > >     vmlinux.llvm.good [0xc0101eb8] <+560>: vmrs   r10, fpinst2
> > > >
> > > >     vmlinux.llvm.bad  [0xc0101cb8] <+48>:  vmrs   r7, fpexc
> > > >     vmlinux.llvm.bad  [0xc0101cd8] <+80>:  vmsr   fpexc, r0
> > > >     vmlinux.llvm.bad  [0xc0101d20] <+152>: vmsr   fpexc, r7
> > > >     vmlinux.llvm.bad  [0xc0101d30] <+168>: vmrs   r0, fpscr
> > > >     vmlinux.llvm.bad  [0xc0101d50] <+200>: vmrs   r6, fpscr  <== BOOM!
> > > >     vmlinux.llvm.bad  [0xc0101d6c] <+228>: vmsr   fpexc, r0
> > > >     vmlinux.llvm.bad  [0xc0101d70] <+232>: vmrs   r0, fpsid
> > > >     vmlinux.llvm.bad  [0xc0101da4] <+284>: vmrs   r10, fpinst
> > > >     vmlinux.llvm.bad  [0xc0101df8] <+368>: vmrs   r4, fpexc
> > > >     vmlinux.llvm.bad  [0xc0101e5c] <+468>: vmrs   r10, fpinst2
> > > >
> > > > I think LLVM's reordering is valid as the code is currently written: the
> > > > compiler doesn't know the instructions have side effects in hardware.
> > > >
> > > > Fix by using "asm volatile" in fmxr() and fmrx(), so they cannot be
> > > > reordered with respect to each other. The original compiler now produces
> > > > working kernels on my hardware with DYNAMIC_DEBUG=n.
> > > >
> > > > This is the relevant piece of the diff of the vfp_support_entry() text,
> > > > from the original oopsing kernel to a working kernel with this patch:
> > > >
> > > >          vmrs r0, fpscr
> > > >          tst r0, #4096
> > > >          bne 0xc0101d48
> > > >          tst r0, #458752
> > > >          beq 0xc0101ecc
> > > >          orr r7, r7, #536870912
> > > >          ldr r0, [r4, #0x3c]
> > > >          mov r9, #16
> > > >         -vmrs r6, fpscr
> > > >          orr r9, r9, #251658240
> > > >          add r0, r0, #4
> > > >          str r0, [r4, #0x3c]
> > > >          mvn r0, #159
> > > >          sub r0, r0, #-1207959552
> > > >          and r0, r7, r0
> > > >          vmsr fpexc, r0
> > > >          vmrs r0, fpsid
> > > >         +vmrs r6, fpscr
> > > >          and r0, r0, #983040
> > > >          cmp r0, #65536
> > > >          bne 0xc0101d88
> > > >
> > > > Fixes: 4708fb041346 ("ARM: vfp: Reimplement VFP exception entry in C code")
> > > > Signed-off-by: Calvin Owens <calvin at wbinvd.org>
> > > > ---
> > > >  arch/arm/vfp/vfpinstr.h | 48 ++++++++++++++++++++++-------------------
> > > >  1 file changed, 26 insertions(+), 22 deletions(-)
> > > >
> > >
> > > Thanks for the patch, and for the excellent analysis.
> > >
> > > Note that this fix has been proposed in the past, as well as another
> > > one addressing the same issue, but we've been incredibly sloppy
> > > getting it merged.
> > >
> > > https://lore.kernel.org/linux-arm-kernel/20240410024126.21589-2-zhuqiuer1@huawei.com/
> > > https://lore.kernel.org/linux-arm-kernel/20240318093004.117153-2-ardb+git@google.com/
> >
> > Ah sorry for missing that, I searched for the symptom not the cure.
> >
> > > What both of us appear to have missed is that there are two versions
> > > of these routines, which should either be dropped (as they are
> > > obsolete now that the minimum binutils version is 2.25) or fixed up as
> > > well, as you do below.
> > >
> > > Anyone have any thoughts on using a memory clobber as opposed to
> > > volatile? I think volatile means that the access cannot be elided, but
> > > it is unclear to me whether that implies any ordering. A 'memory'
> > > clobber implies that globally visible state is updated, which seems
> > > like a stronger guarantee to me.
> >
> > My thinking was that if 'asm volatile' is sufficient, it will suppress
> > less optimization than the clobber, since the clobber might force the
> > compiler to assume unrelated memory could have been modified when it
> > really never is. But I'm not sure about that.
> >
> > Out of curiousity, I tried it both ways with the same compiler just now,
> > the only tiny difference in the emitted vfp_support_entry() is here:
> >
> >         --- /volatile   2024-07-28 13:28:59.890505404 -0700
> >         +++ /memclobber 2024-07-28 13:28:59.890505404 -0700
> >          str r0, [r5, #0x4]
> >          vmrs r7, fpexc
> >          tst r7, #1073741824
> >          bne 0xc0101d28
> >          orr r7, r7, #1073741824
> >          bic r0, r7, #-2147483648
> >          vmsr fpexc, r0
> >         +ldr r8, [pc, #0x26c]
> >          ldr r0, [r5, #0x8]
> >         -ldr r8, [pc, #0x268]
> >          add r6, r5, #224
> >          ldr r0, [r8, r0, lsl #2]
> >          cmp r0, r6
> >          beq 0xc0101d18
> >
> 
> Right. So it doesn't matter in practice - good to know.
> 
> So in the 'volatile' case, I guess we are relying on the compiler not
> reordering those with respect to each other, which should be
> sufficient to ensure that we do not access VFP status register before
> enabling the VFP unit via the control register, as all are accessed
> using inline asm.
> 
> In any case, this should go into the patch system.
> https://www.armlinux.org.uk/developer/patches/section.php?section=0

Done:
https://www.armlinux.org.uk/developer/patches/viewpatch.php?id=9410/1

Thanks,
Calvin

> Note to self and other: follow-up with a patch that removes
> CONFIG_AS_VFP_VMRS_FPINST, which is no longer needed.



More information about the linux-arm-kernel mailing list