perf tools build broken for RISCV 32 bit

Arnd Bergmann arnd at kernel.org
Tue Jan 19 14:47:59 EST 2021


On Tue, Jan 19, 2021 at 5:53 PM Emiliano Ingrassia
<ingrassia at epigenesys.com> wrote:
> On Mon, Jan 18, 2021 at 08:58:50PM +0100, Arnd Bergmann wrote:
> > On Mon, Jan 18, 2021 at 4:33 PM Emiliano Ingrassia
> > <ingrassia at epigenesys.com> wrote:
> > > On Mon, Jan 18, 2021 at 04:19:47PM +0100, Arnd Bergmann wrote:
> > > > On Mon, Jan 18, 2021 at 3:53 PM Emiliano Ingrassia <ingrassia at epigenesys.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > after some searching I found this patch from Yocto project:
> > > > >
> > > > > https://www.mail-archive.com/linux-yocto@lists.yoctoproject.org/msg01004.html.
> > > > >
> > > > > I don't know if it's ever been proposed as mainline patch.
> > > > >
> > > > > What do you think about it? Could it be the right solution?
> > > >
> > > > No. While this makes it work on rv32, it does not fix it in a
> > > > way that makes it work on other architectures with a time64
> > > > libc. If you fix it, please do it in a way that works on all architectures.
> > >
> > > Ok, I totally agree with you on this point.
> > >
> > > So, it is right to conclude that the problem is not in perf
> > > implementation but in the libc implementation?
> > > That is, for example, glibc should define SYS_futex as SYS_futex_time64
> > > in case of riscv 32 bit because of its support to 64 bit time_t?
> >
> > Actually, there might be a somewhat less ugly workaround, if we decide
> > to put a correct wrapper into the exported kernel headers, e.g. adding
> > something along these lines to include/uapi/linux/futex.h or a new header
> > from the kernel:
> >
> > #ifndef __KERNEL__
> > #include <linux/futex.h>
> > #include <linux/time_types.h>
> > #include <asm/unistd.h>
> > #include <sys/time.h>
> > #include <unistd.h>
> > #include <errno.h>
> >
> > #define __kernel_futex __kernel_futex
> > typedef union __attribute__ ((__transparent_union__)) {
> >        const struct timespec *timeout;
> >        unsigned int intval;
> > } __kernel_futex_union;
> >
> > struct __kernel_old_timespec {
> >        __kernel_time_t tv_sec;                 /* seconds */
> >        long            tv_nsec;                /* nanoseconds */
> > };
> >
> > static inline int __kernel_futex(int *uaddr, int op, int val,
> >                                  __kernel_futex_union val2,
> >                                  unsigned *uaddr2, int val3)
> > {
> >         int cmd = op & FUTEX_CMD_MASK;
> >         int ret = -ENOSYS;
> > #ifdef __NR_futex64
> >         {
> >                 if (((cmd != FUTEX_WAIT) && (cmd != FUTEX_WAIT_BITSET)) ||
> >                     (sizeof(time_t) == sizeof(__u64))) {
> >                         ret = syscall(__NR_futex64, uaddr, op, val,
> >                                       val2.timeout, uaddr2, val3);
> >                 }
> >
> >                 if (ret != -ENOSYS && ret != -EPERM)
> >                         return ret;
> >         }
> > #endif
> > #ifdef __NR_futex
> >         {
> >                 if (((cmd != FUTEX_WAIT) && (cmd != FUTEX_WAIT_BITSET)) ||
> >                     sizeof(time_t) == sizeof(__kernel_long_t)) {
> >                         ret = syscall(__NR_futex, uaddr, op, val,
> >                                        val2.timeout, uaddr2, val3);
> >                 } else {
> >                         struct __kernel_old_timespec oldtimeout = {
> >                                 .tv_sec  = val2.timeout->tv_sec,
> >                                 .tv_nsec = val2.timeout->tv_nsec,
> >                         };
> >                         ret = syscall(__NR_futex, uaddr, op, val,
> >                                       &oldtimeout, uaddr2, val3);
> >                 }
> >         }
> > #endif
> >         return ret;
> > }
> > #endif /* __KERNEL__
> >
> > With this, applications can be changed to define their own
> >
> > #ifdef __kernel_futex
> > #define futex(uaddr, futex_op, val, val2, uaddr2, val3) \
> >       __kernel_futex(uaddr, futex_op, val, val2, uaddr2, val3)
> > #else
> > #define futex(uaddr, futex_op, val, val2, uaddr2, val3) \
> >        syscall(__NR_futex, uaddr, futex_op, val, val2, uaddr2, val3)
> > #endif
> >
> >          Arnd
>
> I would like to do a recap on what we have so far.
>
> The goal is to compile perf for RISCV 32 bit with glibc 2.32 which has support
> for 64 bit time_t. The target kernel is 5.10.7 with option COMPAT_32BIT_TIME
> enabled.

I didn't think that COMPAT_32BIT_TIME was even allowed on rv32,
but it seems there is a bug in Kconfig, it should be disabled for any
architecture that does not set __ARCH_WANT_TIME32_SYSCALLS,
but this should not affect the problem at hand.

> The userspace header file bits/syscall.h contains the needed definitions:
> |...
> |#ifdef __NR_futex
> |# define SYS_futex __NR_futex
> |#endif
> |
> |#ifdef __NR_futex_time64
> |# define SYS_futex_time64 __NR_futex_time64
> |#endif
> |...
>
> In case of architecture without __NR_futex definition nor futex() wrapper,
> the syscall function should be called with SYS_futex_time64 instead of the
> classic SYS_futex. This is not backward compatible and also not documented
> as far as I know.
>
> How this problem is solved in real scenarios? For example how can pthread
> library, which uses futex syscall, be compiled correctly in glibc?

The pthread library is built as part of glibc, and it uses its own local
(architecture specific) workaround.

> The solution is quite simple, glibc has a file called sysdep.h which in case of
> RISCV 32 bit contains the following workarounds:
> |...
> |#if __WORDSIZE == 32
> |
> |/* Workarounds for generic code needing to handle 64-bit time_t.  */
> |
> |/* Fix sysdeps/unix/sysv/linux/clock_getcpuclockid.c.  */
> |#define __NR_clock_getres       __NR_clock_getres_time64
> |/* Fix sysdeps/nptl/lowlevellock-futex.h.  */
> |#define __NR_futex              __NR_futex_time64
> |...
>
> So, in that case, __NR_futex is defined as __NR_futex_time64 and so SYS_futex is
> defined and usable.
>
> But this file can not be seen and used by userspace application.

Right, this is intentional, redefining __NR_futex would break any
correctly written application that passes a __kernel_old_timespec
structure when using __NR_futex.

> So, what should be the right solution here?
>
> Should the kernel export only __NR_futex and then check the timer struct
> dimension and distinguish the various cases (time32 vs time64 futex)?
>
> Or should the libc implementation take care to define SYS_futex once and for all
> and then use the proper __NR_futex{_time64} based on its definition of time_t?
>
> Considering that this is not only a problem of futex but of many others syscalls
> (e.g. __NR_clock_gettime{64}, __NR_clock_settime{64}, and so on), probably
> I would argue that the problem should be solved in the libc implementation
> in some way.
>
> What do you think?

The fundamental problem is mixing glibc interfaces with kernel interfaces,
which is broken in many ways. You would hit a similar issue when using
__NR_stat with 'struct stat' from glibc instead of 'struct __old_kernel_stat'.

Ideally, an application would just use the kernel types when calling
the kernel interfaces, or use the libc types and call a libc-provided
function (which glibc does not provide for futex).

There is nothing that glibc or kernel could do here to fix existing
applications, they all have to be changed manually to use a correct
interface when built against a time64 glibc or musl but directly
calling low-level syscall interfaces including but not limited to futex.

rv32 is in the unfortunate position of being the first one that has a glibc
port, so it hits a lot of problems that other architectures do not.
If the applications get fixed properly, then at least they will also
work with musl-1.2 and a (will work-in-progress) glibc that will
eventually allow building with 64-bit time_t.

One thing that glibc can do to help would be to #undef all the
time32 SYS_* macros when 64-bit time_t is used, to force a
build error on broken applications that try to use the wrong types,
but this is also problematic since you sometimes want to use
the numbers for something other than calling syscall(), e.g.
in strace or seccomp.

      Arnd



More information about the linux-riscv mailing list