[PATCH] optimize ktime_divns for constant divisors

Arnd Bergmann arnd at arndb.de
Fri Dec 5 02:08:07 PST 2014


On Thursday 04 December 2014 23:30:08 Nicolas Pitre wrote:
> >          res += (u64)x_lo * y_hi + (u64)x_hi * y_lo;
> 
> That, too, risk overflowing.
> 
> Let's say x_lo = 0xffffffff and x_hi = 0xffffffff.  You get:
> 
>         0xffffffff * 0x83126e97 ->  0x83126e967ced9169
>         0xffffffff * 0x8d4fdf3b ->  0x8d4fdf3a72b020c5
>                                    -------------------
>                                    0x110624dd0ef9db22e
> 
> Therefore the sum doesn't fit into a u64 variable.
> 
> It is possible to skip carry handling but only when the MSB of both 
> constants are zero.  Here it is not the case.

If I understand this right, there are two possible optimizations to
avoid the overflow:

- for anything using monotonic time, or elapsed time, we can guarantee
  that the upper bits are zero. Relying on monotonic time is a bit
  dangerous, because that would mean introducing an API that works
  with ktime_get() but not ktime_get_real(), and risk introducing
  subtle bugs.
  However, ktime_us_delta() could be optimized, and we can introduce
  similar ktime_sec_delta() and ktime_ms_delta() functions with
  the same properties.

- one could always pre-shift the ktime_t value. For a division by
  1000, we can shift right by 3 bits first, then do the multiplication
  and then do another shift. Not sure if that helps at all or if
  the extra shift operation makes this counterproductive.

	Arnd



More information about the linux-arm-kernel mailing list