[PATCH] optimize ktime_divns for constant divisors
Arnd Bergmann
arnd at arndb.de
Fri Dec 5 02:08:07 PST 2014
On Thursday 04 December 2014 23:30:08 Nicolas Pitre wrote:
> > res += (u64)x_lo * y_hi + (u64)x_hi * y_lo;
>
> That, too, risk overflowing.
>
> Let's say x_lo = 0xffffffff and x_hi = 0xffffffff. You get:
>
> 0xffffffff * 0x83126e97 -> 0x83126e967ced9169
> 0xffffffff * 0x8d4fdf3b -> 0x8d4fdf3a72b020c5
> -------------------
> 0x110624dd0ef9db22e
>
> Therefore the sum doesn't fit into a u64 variable.
>
> It is possible to skip carry handling but only when the MSB of both
> constants are zero. Here it is not the case.
If I understand this right, there are two possible optimizations to
avoid the overflow:
- for anything using monotonic time, or elapsed time, we can guarantee
that the upper bits are zero. Relying on monotonic time is a bit
dangerous, because that would mean introducing an API that works
with ktime_get() but not ktime_get_real(), and risk introducing
subtle bugs.
However, ktime_us_delta() could be optimized, and we can introduce
similar ktime_sec_delta() and ktime_ms_delta() functions with
the same properties.
- one could always pre-shift the ktime_t value. For a division by
1000, we can shift right by 3 bits first, then do the multiplication
and then do another shift. Not sure if that helps at all or if
the extra shift operation makes this counterproductive.
Arnd
More information about the linux-arm-kernel
mailing list