[PATCH v3 3/5] clk: introduce the common clock framework

Paul Walmsley paul at pwsan.com
Mon Dec 5 20:37:17 EST 2011


On Mon, 5 Dec 2011, Russell King - ARM Linux wrote:

> On Mon, Dec 05, 2011 at 02:15:56PM -0700, Paul Walmsley wrote:
>
> > But I'd propose that we instead increase the size of struct clk.rate to be 
> > s64:
> > 
> > s64 clk_round_rate(struct clk *clk, s64 desired_rate);
> > int clk_set_rate(struct clk *clk, s64 rate);
> > s64 clk_get_rate(struct clk *clk);
> > 
> > struct clk {
> > ...
> >      s64 rate;
> > ...
> > };
> > 
> > That way the clock framework can accommodate current clock rates, as well 
> > as any conceivable future clock rate.  (Some production CPUs are already 
> > running at clock rates greater than 4 GiHZ[1].  RF devices with 4 GiHz+ 
> > clock rates are also common, such as 802.11a devices running in the 5.8 
> > GHz band, and drivers for those may eventually wish to use the clock 
> > framework.)
> 
> Yuck.  You are aware that 64-bit math on 32-bit CPUs sucks? So burdening 
> _everything_ with 64-bit rate quantities is absurd.  As for making then 
> 64-bit signed quantities, that's asking for horrid code from gcc for the 
> majority of cases.

64-bit divides would be the only real issue in the clock framework 
context.  And those are easily avoided on clock hardware where the parent 
rate can never exceed 32 bits.

For example, here's a trivial implementation for rate recalculation for a 
integer divider clock node (that can't be handled with a right shift):

s64 div(struct clk *clk, u32 div) {
	if (clk->flags & CLK_PARENT_RATE_MAX_U32)
		return ((u32)(clk->parent->rate & 0xffffffff)) / div;

	clk->rate = clk->parent->rate;
	do_div(clk->rate, div);
	return clk->rate;
}

gcc generates this for ARMv6:

00000010 <div>:
  10:	e92d4038 	push	{r3, r4, r5, lr}
  14:	e1a05000 	mov	r5, r0         
  18:	e5d03010 	ldrb	r3, [r0, #16]  @ clk->flags
  1c:	e1a04001 	mov	r4, r1
  20:	e3130001 	tst	r3, #1         @ 32-bit parent rate?
  24:	1a000007 	bne	48 <div+0x38>  @ do 32-bit divide 

/* 64-bit divide by 32-bit follows */

  28:	e5903000 	ldr	r3, [r0]
  2c:	e1c300d8 	ldrd	r0, [r3, #8]
  30:	ebfffffe 	bl	0 <__do_div64>
  34:	e1a00002 	mov	r0, r2
  38:	e1a01003 	mov	r1, r3
  3c:	e5852008 	str	r2, [r5, #8]
  40:	e585300c 	str	r3, [r5, #12]
  44:	e8bd8038 	pop	{r3, r4, r5, pc}

/* 32-bit divide follows */

  48:	e5900000 	ldr	r0, [r0]
  4c:	e5900008 	ldr	r0, [r0, #8]
  50:	ebfffffe 	bl	0 <__aeabi_uidiv>
  54:	e3a01000 	mov	r1, #0
  58:	e8bd8038 	pop	{r3, r4, r5, pc}


Not bad at all, and this isn't even optimized.  (The conditional could be 
avoided completely with a little work during clock init.)  What little 
overhead there is seems quite trivial if it means that the clock framework 
will be useful for present and future devices.


- Paul



More information about the linux-arm-kernel mailing list