[PATCH] __div64_32: implement division by multiplication for 32-bit arches

Alexey Brodkin Alexey.Brodkin at synopsys.com
Thu Oct 29 07:32:23 PDT 2015


Hi Russell,

On Thu, 2015-10-29 at 13:31 +0000, Russell King - ARM Linux wrote:
> On Thu, Oct 29, 2015 at 01:47:35AM +0300, Alexey Brodkin wrote:
> > diff --git a/lib/div64.c b/lib/div64.c
> > index 62a698a..3055328 100644
> > --- a/lib/div64.c
> > +++ b/lib/div64.c
> > +/*
> > + * If the divisor happens to be constant, we determine the appropriate
> > + * inverse at compile time to turn the division into a few inline
> > + * multiplications instead which is much faster.
> > + */
> >  uint32_t __attribute__((weak)) __div64_32(uint64_t *n, uint32_t base)
> >  {
> > -	uint64_t rem = *n;
> > -	uint64_t b = base;
> > -	uint64_t res, d = 1;
> > -	uint32_t high = rem >> 32;
> > -
> > -	/* Reduce the thing a bit first */
> > -	res = 0;
> > -	if (high >= base) {
> > -		high /= base;
> > -		res = (uint64_t) high << 32;
> > -		rem -= (uint64_t) (high*base) << 32;
> > -	}
> > +	unsigned int __r, __b = base;
> >  
> > -	while ((int64_t)b > 0 && b < rem) {
> > -		b = b+b;
> > -		d = d+d;
> > -	}
> > +	if (!__builtin_constant_p(__b) || __b == 0) {
> 
> Can you explain how __builtin_constant_p(__b) can be anything but false
> here?  I can't see that this will ever be true.
> 
> This is a function in its own .c file - the compiler will have no
> knowledge about the callers of this function scattered throughout the
> kernel, and it has to assume that the 'base' argument to this function
> is variable.  So, __builtin_constant_p(__b) will always be false, which
> means this if () statement will always be true and the else clause will
> never be used.

You're right: constant propagation will only happen if __div64_32() is inlined.
For that we would need to add the "inline" specifier to __div64_32(), but that
in turn would prevent the use of the more optimal arch-specific __div64_32()
implementations, which rely on overriding the weak out-of-line symbol.
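
To illustrate the point, here is a minimal user-space sketch (not kernel code).
Both helper names are made up for this example. The out-of-line helper is
compiled once for all callers, so its argument is never a compile-time constant;
the inlined one may see a constant, but only when the compiler actually inlines
and optimizes it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: out-of-line, compiled once for every caller,
 * so 'base' is just a runtime parameter here.
 */
__attribute__((noinline)) int base_known_out_of_line(uint32_t base)
{
	return __builtin_constant_p(base) ? 1 : 0;
}

/*
 * Hypothetical helper: inlined into each caller. After inlining, the
 * compiler may substitute a literal for 'base', and only then can
 * __builtin_constant_p() evaluate to true. Whether it does depends on
 * the compiler and optimization level.
 */
static inline __attribute__((always_inline))
int base_known_inline(uint32_t base)
{
	return __builtin_constant_p(base) ? 1 : 0;
}
```

Calling base_known_inline(10) in an optimized GCC build is where I would expect
to see 1; with a value only known at run time both helpers report 0.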

And that was my main question: how to implement this properly, i.e. have a
better generic do_div() (or __div64_32() as its heavy-lifting part) and still
keep the ability for some architectures to use their own implementations.
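
One possible arrangement, sketched below under assumptions of my own: keep the
heavy lifting out of line and weak (so an arch can still override it) and move
the __builtin_constant_p() check into an inline wrapper that is expanded at each
call site. The names sketch_div64_32() and sketch_do_div() are hypothetical, the
fallback simply uses the compiler's 64-bit division as a stand-in, and the
power-of-two fast path stands in for the real reciprocal-multiplication path.

```c
#include <stdint.h>

/*
 * Hypothetical out-of-line fallback. It stays weak, so an architecture
 * could still supply its own optimized version. Plain 64-bit division
 * is used here only as a stand-in to keep the sketch short.
 */
uint32_t __attribute__((weak)) sketch_div64_32(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/*
 * Hypothetical inline wrapper, expanded in every caller. Only here can
 * the compiler know whether 'base' is a compile-time constant.
 */
static inline __attribute__((always_inline))
uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
	if (__builtin_constant_p(base) &&
	    base != 0 && (base & (base - 1)) == 0) {
		/* Constant power of two: mask and shift, no call needed. */
		uint32_t rem = (uint32_t)(*n & (base - 1));

		*n >>= __builtin_ctz(base);
		return rem;
	}

	/* Variable (or unsuitable) divisor: use the out-of-line code. */
	return sketch_div64_32(n, base);
}
```

Both paths leave the quotient in *n and return the remainder, so callers see
the same result whichever path the compiler picks.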

-Alexey

