[PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
Shilimkar, Santosh
santosh.shilimkar at ti.com
Tue Jul 17 02:11:43 EDT 2012
On Tue, Jul 17, 2012 at 8:40 AM, Shinya Kuribayashi
<shinya.kuribayashi.px at renesas.com> wrote:
> Will, Stephen and Santosh,
>
> On 7/13/2012 8:13 PM, Will Deacon wrote:
>> I was anticipating that the platform would set the initial loops_per_jiffy
>> value if it requires udelays before loop calibration and the default of 4k
>> is wildly off.
>
> I overlooked that two different lpj setups were involved here.
>
> The first one was that the initial loops_per_jiffy value of 4k is too
> small for almost all processors running Linux today, so I set up
> loops_per_jiffy _early_, calculated from the CPU clock speed. I didn't
> mention this before, sorry for the confusion.
>
> So my initial loops_per_jiffy is not 4k at this point. It's optimized
> for loop-based delay with the CPU running at 1.2GHz (much bigger than
> the default 4k).
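[ A minimal sketch of the kind of early setup described above -- not
Shinya's actual code.  The clock hook name and the ~2-cycles-per-loop
factor are assumptions for illustration only:

/* Early, clock-derived lpj so udelay() works before calibrate_delay() */
void __init my_plat_init_early(void)                   /* hypothetical hook */
{
        unsigned long cpu_hz = my_plat_get_cpu_rate(); /* hypothetical */

        /* assume the ARM delay loop costs roughly 2 cycles per iteration */
        loops_per_jiffy = cpu_hz / (2 * HZ);
}
]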
>
> And later, init_current_timer_delay() got processed. Actual udelay()
> behavior switched from the loop-based delay to the timer-based one
> immediately, while my loops_per_jiffy was not updated to an appropriate value.
>
> This is why my udelay()s, used after init_current_timer_delay(), were
> taking a considerably long time to expire. Note that my initial tests
> for Will's patchset were done using a loadable module dedicated to
> udelay tests, which was prepared for 2.6.35/3.0 kernels beforehand.
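[ To put rough numbers on the overshoot (these figures are illustrative
assumptions, not Shinya's measurements): with HZ=100 and a ~2-cycle
delay loop, a 1.2GHz CPU gives loops_per_jiffy of about
1.2e9 / (2 * 100) = 6,000,000, while e.g. a 24MHz delay timer wants
lpj_fine = 24e6 / 100 = 240,000.  If the timer-based udelay() path
still scales by the stale loop-based value, every delay runs roughly
25x too long. ]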
>
> And this time, I confirmed that updating loops_per_jiffy at the same
> time as lpj_fine works perfectly as expected for me.
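[ Sketch only, not Will's actual follow-up patch: the update could sit
right where the delay ops are switched over, assuming
init_current_timer_delay() receives the timer frequency as in this
series:

void __init init_current_timer_delay(unsigned long freq)
{
        lpj_fine = freq / HZ;
        loops_per_jiffy = lpj_fine;     /* the update this thread is about */
        /* ... then install the timer-based delay ops as in the series ... */
}
]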
>
>> If people need loops_per_jiffy to be updated at the same time as lpj_fine,
>> I can post that as a separate patch (below) as Russell has merged v2 of these
>> patches into his delay branch. That said, I'd certainly like to know if this
>> is actually a real problem (and can't be solved by choosing a compromise value
>> as the initial loops_per_jiffy). I think Shinya was doing some tests so
>> I'll wait to see how those went.
>
> From my observations:
>
> (1) loops_per_jiffy can easily be calculated from the CPU clock speed.
> If your platform is capable of detecting the CPU frequency at run-time,
> setting up loops_per_jiffy _early_ allows early use of udelay()s.
>
> Or even if you don't need udelay() early, setting up lpj_fine (or having
> calibrate_delay_is_known()) allows you to skip calibrate_delay() later.
> This is useful and can be applied to both UP and SMP systems.
>
> (2) For SMP platforms, if you need early use of udelay(), you have to
> update loops_per_jiffy at the same time as init_current_timer_delay().
> It could be done in init_current_timer_delay() itself, or platforms that
> need udelay() available early can take care of it themselves. Either one
> should be fine with me.
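[ On the calibrate_delay_is_known() idea in (1): a rough sketch,
assuming a weak hook that calibrate_delay() consults before running the
loop calibration, so secondary CPUs can reuse an already-known value:

/* Hypothetical arch/platform override */
unsigned long calibrate_delay_is_known(void)
{
        /* non-zero: use this lpj; 0: fall back to normal calibration */
        return lpj_fine;
}
]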
Thanks for the detailed explanation. CPU clock detection is indeed the
neat way to skip the calibration overhead, and this was one of the
comments raised when I tried to push for skipping the calibration on
secondary CPUs. It looks like you have a working patch for the clock
detection. Will you be able to post that patch so that this long-pending
calibration of secondary CPUs finally gets optimized?
Regards
Santosh