[RFC PATCH 00/29] arm64: Scalable Vector Extension core support

Dave Martin Dave.Martin at arm.com
Fri Dec 2 10:21:29 PST 2016


On Fri, Dec 02, 2016 at 04:59:27PM +0000, Joseph Myers wrote:
> On Fri, 2 Dec 2016, Florian Weimer wrote:
> 
> > > However, it would be necessary to prevent GCC from moving any code
> > > across these statements -- in particular, SVE code that access VL-
> > > dependent data spilled on the stack is liable to go wrong if reordered
> > > with the above.  So the sequence would need to go in an external
> > > function (or a single asm...)
> > 
> > I would talk to GCC folks—we have similar issues with changing the FPU
> > rounding mode, I assume.
> 
> In general, GCC doesn't track the implicit uses of thread-local state 
> involved in floating-point exceptions and rounding modes, and so doesn't 
> avoid moving code across manipulations of such state; there are various 
> open bugs in this area (though many of the open bugs are for local rather 
> than global issues with code generation or local optimizations not 
> respecting exceptions and rounding modes, which are easier to fix).  Hence 
> glibc using various macros such as math_opt_barrier and math_force_eval 
> which use asms to prevent such motion.

Presumably the C language specs specify that fenv manipulations cannot
be reordered with respect to evaluation or floating-point expressions?
Sanity would seem to require this, though I've not dug into the specs
myself yet.

This doesn't get us off the hook for prctl() -- the C specs can only
define constraints on reordering for things that appear in the C spec.
prctl() is just an external function call in this context, and doesn't
enjoy the same guarantees.

> I'm not familiar enough with the optimizers to judge the right way to 
> address such issues with implicit use of thread-local state.  And I 
> haven't thought much yet about how to implement TS 18661-1 constant 
> rounding modes, which would involve the compiler implicitly inserting 
> rounding modes changes, though I think it would be fairly straightforward 
> given underlying support for avoiding inappropriate code motion.

My concern is that the compiler has no clue about what code motions are
appropriate or not with respect to a system call, beyond what applies
to a system call in general (i.e., asm volatile ( ::: "memory" ) for
GCC).

?

Cheers
---Dave



More information about the linux-arm-kernel mailing list