[PATCH] optimize ktime_divns for constant divisors

Fri Dec 5 02:08:07 PST 2014

On Thursday 04 December 2014 23:30:08 Nicolas Pitre wrote:
> >          res += (u64)x_lo * y_hi + (u64)x_hi * y_lo;
> 
> That, too, risk overflowing.
> 
> Let's say x_lo = 0xffffffff and x_hi = 0xffffffff.  You get:
> 
>         0xffffffff * 0x83126e97 ->  0x83126e967ced9169
>         0xffffffff * 0x8d4fdf3b ->  0x8d4fdf3a72b020c5
>                                    -------------------
>                                    0x110624dd0ef9db22e
> 
> Therefore the sum doesn't fit into a u64 variable.
> 
> It is possible to skip carry handling but only when the MSB of both 
> constants are zero.  Here it is not the case.

If I understand this right, there are two possible optimizations to
avoid the overflow:

- for anything using monotonic time, or elapsed time, we can guarantee
  that the upper bits are zero. Relying on monotonic time is a bit
  dangerous, because that would mean introducing an API that works
  with ktime_get() but not ktime_get_real(), and risk introducing
  subtle bugs.
  However, ktime_us_delta() could be optimized, and we can introduce
  similar ktime_sec_delta() and ktime_ms_delta() functions with
  the same properties.

- one could always pre-shift the ktime_t value. For a division by
  1000, we can shift right by 3 bits first, then do the multiplication
  and then do another shift. Not sure if that helps at all or if
  the extra shift operation makes this counterproductive.

	Arnd