[RFC PATCH 00/29] arm64: Scalable Vector Extension core support
Dave Martin
Dave.Martin at arm.com
Fri Dec 2 10:21:29 PST 2016
On Fri, Dec 02, 2016 at 04:59:27PM +0000, Joseph Myers wrote:
> On Fri, 2 Dec 2016, Florian Weimer wrote:
>
> > > However, it would be necessary to prevent GCC from moving any code
> > > across these statements -- in particular, SVE code that access VL-
> > > dependent data spilled on the stack is liable to go wrong if reordered
> > > with the above. So the sequence would need to go in an external
> > > function (or a single asm...)
> >
> > I would talk to GCC folks—we have similar issues with changing the FPU
> > rounding mode, I assume.
>
> In general, GCC doesn't track the implicit uses of thread-local state
> involved in floating-point exceptions and rounding modes, and so doesn't
> avoid moving code across manipulations of such state; there are various
> open bugs in this area (though many of the open bugs are for local rather
> than global issues with code generation or local optimizations not
> respecting exceptions and rounding modes, which are easier to fix). Hence
> glibc using various macros such as math_opt_barrier and math_force_eval
> which use asms to prevent such motion.
Presumably the C language specs require that fenv manipulations cannot
be reordered with respect to the evaluation of floating-point expressions?
Sanity would seem to require this, though I've not dug into the specs
myself yet.

This doesn't get us off the hook for prctl() -- the C specs can only
define constraints on reordering for things that appear in the C spec.
prctl() is just an external function call in this context, and doesn't
enjoy the same guarantees.
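
To make that concrete, here is a rough sketch of how a vector-length
prctl() looks from the compiler's side. The names are assumptions for
illustration: PR_SVE_GET_VL (value 51) is the option that later appeared
in mainline Linux headers, not something fixed by this RFC, and
query_vl() is a hypothetical helper. Nothing in the call tells the
compiler that VL-dependent state is involved.

```c
#include <sys/prctl.h>

/* Assumption: option number from later mainline Linux headers;
 * defined here only if the installed headers lack it. */
#ifndef PR_SVE_GET_VL
#define PR_SVE_GET_VL 51
#endif

/*
 * Hypothetical helper. noinline keeps the call out of line, so the
 * caller sees only an ordinary function call. The compiler knows
 * nothing about the vector length; it applies only the rules it
 * applies to any call whose body it cannot fully analyse.
 */
__attribute__((noinline)) long query_vl(void)
{
    /* Returns -1 (with errno set) on kernels or CPUs without SVE. */
    return prctl(PR_SVE_GET_VL, 0L, 0L, 0L, 0L);
}
```

Putting such a helper in a separate translation unit would make it more
opaque still, but either way the compiler is given no VL-specific
ordering guarantee.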
> I'm not familiar enough with the optimizers to judge the right way to
> address such issues with implicit use of thread-local state. And I
> haven't thought much yet about how to implement TS 18661-1 constant
> rounding modes, which would involve the compiler implicitly inserting
> rounding modes changes, though I think it would be fairly straightforward
> given underlying support for avoiding inappropriate code motion.
My concern is that the compiler has no clue about what code motions are
appropriate or not with respect to a system call, beyond what applies
to a system call in general (i.e., asm volatile ("" ::: "memory") for
GCC).
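
As a minimal sketch of what that barrier does and does not constrain
(compiler_barrier(), shared and barrier_demo() are hypothetical names,
and this assumes GCC or a compatible compiler): the "memory" clobber
stops the compiler from caching or moving memory accesses across the
asm, but a computation that lives purely in registers is not pinned
by it.

```c
/* GCC-style compiler barrier: emits no instructions, but tells the
 * compiler that arbitrary memory may be read or written here. */
#define compiler_barrier() __asm__ volatile ("" ::: "memory")

static int shared;   /* lives in memory, so the clobber applies to it */

int barrier_demo(int x)
{
    shared = x;             /* store must be emitted before the barrier */
    compiler_barrier();

    int reg_only = x * 2;   /* no memory access: the clobber alone does
                             * not stop this from being scheduled
                             * elsewhere */
    compiler_barrier();

    return shared + reg_only;   /* shared is reloaded after the barrier */
}
```

The result is the same wherever reg_only is computed, which is exactly
why the compiler feels free to move it; code whose result depends on
the vector length at the point of execution would not be so forgiving.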
?
Cheers
---Dave