hrtimer: one more expiry time overflow check in hrtimer_interrupt

Shinya Kuribayashi shinya.kuribayashi.px at renesas.com
Fri Jun 14 02:18:53 EDT 2013


Hello,

On 6/12/2013 9:21 PM, Prarit Bhargava wrote:
> On 06/11/2013 03:41 AM, Shinya Kuribayashi wrote:
>>   [   27.857330] hrtimer: interrupt took 0 ns
> 
> ^^^ see below ...

This may be because the frequency of our tick timer source in this
system is very slow, it's 32768 Hz.  I think it's not suitable for
the high resolution timer at all, but we had no choice...

And with this patch applied,

> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index cdd5607..bc37552 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1371,6 +1371,8 @@ retry:
>  	tick_program_event(expires_next, 1);
>  	printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
>  		    ktime_to_ns(delta));
> +	printk_once(KERN_WARNING "# now=%lld entry_time=%lld\n", ktime_to_ns(now), ktime_to_ns(entry_time));
> +	printk_once(KERN_WARNING "# expires_next=%lld\n", ktime_to_ns(expires_next));
>  }
>  
>  /*

I've got this:

[   22.857849] hrtimer: interrupt took 0 ns
[   22.861755] # now=-4294967273343634023 entry_time=-4294967273343634023
[   22.868286] # expires_next=-4294967273343634023

now == entry_time == expires_next, and delta == 0.

This means:
1) there haven't been no tick interrupts between 'entry_time' and 'now',
2) delta can be zero,
3) expires_next = ktime_add(now, delta) doesn't move expires_next forward.

Ok, it's simply hrtimer is not supposed to work with such a slow timer
source.  However, this issue happens on other shmobile ARM systems with
more faster tick timer source (10--13MHz), see below.

>> @@ -1368,6 +1370,8 @@ retry:
>>  		expires_next = ktime_add_ns(now, 100 * NSEC_PER_MSEC);
>>  	else
>>  		expires_next = ktime_add(now, delta);
>> +	if (expires_next.tv64 < 0)
>> +		expires_next.tv64 = KTIME_MAX;
> 
> Even with this change you will still see the warning below if delta = 0.

Correct.

>>  	tick_program_event(expires_next, 1);
>>  	printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
>>  		    ktime_to_ns(delta));
> 
> So I'm not sure that this is the correct thing to do.
> 
> Is this reproducible on any ARM system?  I'll grab an x86_64 box and give it a
> shot there too.  Can you dump the values of now, delta, and expires_next when
> the printk_once triggers?

It's 100% reproducible in our shmobile ARM kernels, v3.4 and v3.10-rc2.
And it doesn't reproduce with v2.6.35-based kernels so far.

Here're logs from Cortex-A15 UP/SMP systems.  They're having ARM
architected tiemrs (the Generic Timers, running at 10--13MHz) using as
clocksource, clockevents, sched_clock, and read_current_timer (udelay)
sources.

-----------------------------------------------------------------------------
machine: R-Mobile APE6 (Cortex-A15 UP system - it's octo-core A15/A7 SoC, but not yet available in mainline)
kernel : v3.10-rc2 mainline
timer  : the Generic Timers (arch/arm/kernel/arch_timer.c) running at 13MHz
-----------------------------------------------------------------------------
|# uname -a
|Linux arm 3.10.0-rc2-00966-gd658f9e #1284 Tue Jun 11 13:36:10 JST 2013 armv7l GNU/Linux
|# cat /proc/timer_list |egrep "Clock Event Device|event_handler"
|Clock Event Device: arch_sys_timer
| event_handler:  hrtimer_interrupt
|# date -s "2038-1-19 3:14:00"
|Tue Jan 19 03:14:00 UTC 2038
|# [  138.942325] ------------[ cut here ]------------
|[  138.946910] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x130/0x148()
|[  138.955315] Modules linked in:
|[  138.958339] CPU: 0 PID: 0 Comm: swapper Tainted: G        W    3.10.0-rc2-00966-gd658f9e #1284
|[...]
|[  139.131970] ---[ end trace f879bec39a6bf85c ]---
|[  139.136521] hrtimer: interrupt took 2385 ns

-----------------------------------------------------------------------------
machine: R-Car H2 (quad-core Cortex-A15 SMP system)
kernel : v3.4-based custom Android kernel
timer  : the Generic Timers (arch/arm/kernel/arch_timer.c) running at 10MHz
-----------------------------------------------------------------------------
|root at renesas:~# date -s "2038-1-19 3:14:00"
|Tue Jan 19 03:14:00 GMT 2038
|root at renesas:~# [   26.458377] ------------[ cut here ]------------
|[   26.472214] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x3c/0x138()
|[...]
|[   26.980342] ---[ end trace 5218f4fd6603f493 ]---
|[   26.994166] hrtimer: interrupt took 1900 ns
|[   27.006687] # now=-4294967269549995552 entry_time=-4294967269549997452
|[   27.026240] # expires_next=-4294967269549993652

I also gave a try on my Ubuntu box, but it seems to work without
printk_once() at the bottome of hrtimer_interrupt() printed, which means
that hrtimer_update_base()-and-retries mechanism works as expected.

$ uname -a
Linux RI106878 3.2.0-45-generic #70-Ubuntu SMP Wed May 29 20:12:06 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

I'd like to see whether this happens on other ARM machines excpet for
shmobile.  Could someone help, please?
--
Shinya Kuribayashi
Renesas Electronics



More information about the linux-arm-kernel mailing list