udelay() broken for SMP cores?

Russell King - ARM Linux linux at arm.linux.org.uk
Wed Apr 21 16:21:15 EDT 2010


On Wed, Apr 21, 2010 at 08:52:25PM +0100, Jamie Lokier wrote:
> Russell King - ARM Linux wrote:
> > We could go to ns delays, but then we have a big problem - the cost of
> > calculating the number of loops starts to become significant compared to
> > the delays - and that's a quality of implementation factor.  In fact,
> > the existing cost has always been significant for short delays for
> > slower (sub-100MHz) ARMs.
> 
> I'm surprised it makes much difference to, say, 20MHz ARMs because you
> could structure it as a nested loop, the inner one executed once per
> microsecond and calibrated to 1us.  The delays don't have to be super
> accurate.

You don't understand the issue.  On older ARMs, the single 32-bit
multiply is not cheap; it shows up as having a significant time
expense for very short delays - and that _does_ matter.

Consider system performance where you're driving a bus using udelay()
to provide 1us timings, but udelay() ends up taking 10us instead every
time because of the calculation of the number of loops for a 1us timing.
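
To put rough numbers on that, here is a minimal sketch. It is not the
kernel's actual udelay() code. It shows a loop-count calculation and the
time a short delay really takes once the cost of that calculation is
included. The helpers usecs_to_loops() and effective_delay_ns() are
hypothetical names, and the cycle count in the note below is an assumed
figure, not a measurement.

```c
#include <stdint.h>

/*
 * Convert a delay in microseconds into delay-loop iterations.
 * loops_per_jiffy and hz mirror the kernel's calibration values, but
 * this helper is hypothetical.  It assumes short delays, so that the
 * intermediate product fits in 64 bits.
 */
static uint32_t usecs_to_loops(uint32_t usecs, uint32_t loops_per_jiffy,
                               uint32_t hz)
{
    uint64_t loops_per_sec = (uint64_t)loops_per_jiffy * hz;

    return (uint32_t)((uint64_t)usecs * loops_per_sec / 1000000u);
}

/*
 * Total time spent in one delay call, in nanoseconds: the requested
 * delay plus the fixed cost of working out the loop count, modelled
 * as setup_cycles CPU cycles at cpu_hz.
 */
static uint64_t effective_delay_ns(uint32_t requested_ns,
                                   uint32_t setup_cycles, uint32_t cpu_hz)
{
    uint64_t setup_ns = (uint64_t)setup_cycles * 1000000000u / cpu_hz;

    return requested_ns + setup_ns;
}
```

With an assumed 20MHz CPU and an assumed 180 cycles of setup,
effective_delay_ns(1000, 180, 20000000) comes to 10000: the 1us request
costs 10us, and the setup cost is the same however short the delay.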

> With a fixed-speed clock register known at compile time, the
> calculation tends to constant-fold nicely, even for ns delays.  Not
> suitable for multi-target kernels but good on single-target.

Here you're making a very big assumption - that there's some register
you can read which is regularly clocked.  That's not true on a lot of
older ARMs, where we struggle to satisfy sched_clock() due to lack of
such a register.
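
For contrast, a delay built on such a register would look roughly like
the sketch below. It only works if something like read_counter() exists
in hardware. Here read_counter() is a hypothetical stand-in that
simulates a free-running counter so the example runs anywhere, and
timer_delay() is likewise an illustrative name.

```c
#include <stdint.h>

/* Simulated free-running counter; stands in for a hardware timer register. */
static uint32_t fake_counter;

/* Hypothetical accessor: each read advances the simulated counter by one tick. */
static uint32_t read_counter(void)
{
    return fake_counter++;
}

/*
 * Spin until at least 'ticks' counter ticks have elapsed.  The unsigned
 * subtraction keeps the comparison correct across counter wrap-around.
 * Returns the number of ticks actually observed.
 */
static uint32_t timer_delay(uint32_t ticks)
{
    uint32_t start = read_counter();
    uint32_t elapsed;

    do {
        elapsed = read_counter() - start;
    } while (elapsed < ticks);

    return elapsed;
}
```

No loop-count multiply is needed here, but without a regularly clocked
counter there is nothing for read_counter() to read.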


