udelay() broken for SMP cores?
Russell King - ARM Linux
linux at arm.linux.org.uk
Wed Apr 21 16:57:45 EDT 2010
On Wed, Apr 21, 2010 at 09:47:18PM +0100, Jamie Lokier wrote:
> Russell King - ARM Linux wrote:
> > You don't understand the issue. On older ARMs, the single 32-bit
> > multiply is not cheap; it shows up as having a significant time
> > expense for very short delays - and that _does_ matter.
> >
> > Consider system performance where you're driving a bus using udelay()
> > to provide 1us timings, but udelay ends up taking 10us instead every
> > time because of the calculation for number of loops for a 1us timing.
>
> Hence nested loop. You don't multiply. No calculation.
OK, since you seem to have a clear idea of how to convert this into a
doubly nested loop, try converting this:
					@ 0 <= r0 <= 0x7fffff06
	ldr	r2, .LC0 (loops_per_jiffy)
	ldr	r2, [r2]		@ max = 0x01ffffff
	mov	r0, r0, lsr #14		@ max = 0x0001ffff
	mov	r2, r2, lsr #10		@ max = 0x00007fff
	mul	r0, r2, r0		@ max = 2^32-1
	movs	r0, r0, lsr #6
	moveq	pc, lr
1:	subs	r0, r0, #1
	bhi	1b
	mov	pc, lr
into two loops without losing the precision - note that the multiply
is part of a 'dividing by multiply+shift' technique.
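
For readers following the arithmetic, the loop count computed by the
instructions above can be modelled in C. This is only an illustrative
sketch, not kernel code: the function names delay_loops and
delay_loops_exact and the parameter names xloops and lpj are made up
here, and the 64-bit version is an assumption about what the truncated
32-bit sequence approximates (a multiply by loops_per_jiffy followed by
a total right shift of 14 + 10 + 6 = 30 bits).

```c
#include <stdint.h>

/*
 * Hypothetical model of the 32-bit sequence in the assembly.
 * xloops: the value passed in r0 (0 <= xloops <= 0x7fffff06)
 * lpj:    loops_per_jiffy (assumed <= 0x01ffffff, per the comments)
 */
static uint32_t delay_loops(uint32_t xloops, uint32_t lpj)
{
    uint32_t a = xloops >> 14;  /* mov r0, r0, lsr #14 */
    uint32_t b = lpj >> 10;     /* mov r2, r2, lsr #10 */
    uint32_t prod = a * b;      /* mul r0, r2, r0 (fits in 32 bits) */

    return prod >> 6;           /* movs r0, r0, lsr #6 */
}

/*
 * Assumed full-precision reference: the same scaling done with a
 * 64-bit product, which the sequence above avoids by pre-shifting
 * both operands before the single 32-bit multiply.
 */
static uint32_t delay_loops_exact(uint32_t xloops, uint32_t lpj)
{
    return (uint32_t)(((uint64_t)xloops * lpj) >> 30);
}
```

With the maximum inputs from the comments (r0 = 0x7fffff06,
loops_per_jiffy = 0x01ffffff) the 32-bit version yields 0x03fff600
loops. Because both operands are truncated before the multiply, its
result never exceeds the 64-bit reference.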