[PATCH 08/10] ARM: OMAP: Clean-up timer posted mode support

Jon Hunter jon-hunter at ti.com
Tue Sep 11 12:26:07 EDT 2012


On 09/10/2012 07:58 PM, Tony Lindgren wrote:
> * Jon Hunter <jon-hunter at ti.com> [120910 15:00]:
>>
>> On 09/07/2012 05:22 PM, Tony Lindgren wrote:
>>> * Jon Hunter <jon-hunter at ti.com> [120905 12:05]:
>>>> The dmtimer functions to read and write the dmtimer registers are currently
>>>> defined as follows ...
>>>>
>>>> static inline u32 __omap_dm_timer_read(struct omap_dm_timer *timer, u32 reg,
>>>> 						int posted);
>>>> static inline void __omap_dm_timer_write(struct omap_dm_timer *timer,
>>>> 						u32 reg, u32 val, int posted);
>>>>
>>>> The posted variable indicates if the timer is configured to use the posted mode
>>>> when performing register accesses. The posted mode configuration of the dmtimer
>>>> is stored in the omap_dm_timer structure that is also being passed to the above
>>>> functions and therefore we do not need to pass the posted variable separately.
>>>> Therefore, simplify the above functions by removing the posted variable as an
>>>> argument as this is not necessary.
>>>
>>> I believe the reason for passing the posted flag was to optimize out some
>>> functions from the timer code as that's being run all the time.
>>>
>>> Care to check the assembly before and after this patch for the timer
>>> functions with objdump -d to make sure it does not add tons of bloat
>>> there?
>>
>> Hi Tony,
>>
>> Thanks for the details here. I see that makes sense and that the
>> compiler could take advantage of this as the functions are inlined.
>>
>> I have taken a look at the disassembled output using objdump as you
>> mentioned. What I see is ...
>>
>> 1. For dmtimer.c the impact appears negligible, the total number of
>>    lines outputted by objdump only changed by 8 with (1215 lines) and
>>    without (1207 lines) the patch applied.
>>
>> 2. For timer.c the impact is greater. I see that
>>    omap2_gp_timer_set_next_event() increased by 6 instructions from 29
>>    to 35. clocksource_read_cycles() increased by 2 instructions 15 to
>>    17 instructions. dmtimer_read_sched_clock() increased by 2
>>    instructions from 17 to 19. omap2_gp_timer_set_mode() increased by
>>    21 instructions from 102 to 123.
>>
>> I imagine that we are mainly concerned about
>> omap2_gp_timer_set_next_event(), clocksource_read_cycles() and
>> dmtimer_read_sched_clock() as these will be called often. Therefore, I
>> am not sure if you wish to drop this patch.
> 
> Well does it at lots of new ldr to the critical code?

For the omap2_gp_timer_set_next_event() function it adds 3 load
instructions, increasing the number of possible loads from 11 to 14.

For the clocksource_read_cycles() function it adds 1 load instruction,
increasing the number of possible loads from 7 to 8.

For the dmtimer_read_sched_clock() function, I don't see any additional
loads, but instructions added are a tst and beq instruction.

>> By the way, if we do drop this patch, I would then need to fix the
>> setting of the posted variable in mach-omap2/timer.c for clock-source in
>> the case where a dmtimer is used. Today the code assumes that for
>> clock-source and clock-events posted mode is always used. However, with
>> the errata i103/i767 we will disable posted mode for clock-source on
>> omap2/3/4/5/am33xx devices.
> 
> Yes I see, I guess that means just adding a new systimer entry?

Actually, I think we can avoid that by not using posted mode for
clock-source timers at all. Posted mode only benefits the clock-events
timers that are configured often.

The benefit of not using posted mode for clock-source timers and setting
the "posted" parameter to 0, will really allow the compiler to optimise
the clocksource_read_cycles() and dmtimer_read_sched_clock() quite a
bit. So this could be a nice optimisation.

Cheers
Jon



More information about the linux-arm-kernel mailing list