Code generation involving __raw_readl and __raw_writel

Arnd Bergmann arnd at arndb.de
Thu Nov 27 03:23:10 PST 2014


On Thursday 27 November 2014 11:40:34 Mason wrote:
> Hello everyone,
> 
> Consider the following code (preprocessor output):
> 
> static int tangox_target(struct cpufreq_policy *policy, unsigned int index)
> {
>   while (__raw_readl((volatile void *)(0xf0000000 +(0x10024))) >> 31);
>   __raw_writel(0, (volatile void *)(0xf0000000 +(0x10024)));
>   return 0;
> }

First of all:
- Don't use __raw_readl in driver code; use readl or readl_relaxed.
- When you busy-loop, add a cpu_relax() to the loop body.
- Use proper types: 'void __iomem *', not 'volatile void *'.
- Use of_iomap or devm_ioremap_resource to get the pointer for
  a device; don't just hardcode virtual addresses.
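
As a rough illustration of the points above, here is a minimal sketch of
the polling loop that builds in userspace. The __iomem macro, cpu_relax()
and the readl_relaxed()/writel_relaxed() bodies are stand-ins written for
this example, not the kernel's implementations. POLL_BIT, fake_reg and
tangox_wait_and_clear() are hypothetical names, and the 'reg' argument is
assumed to come from of_iomap() or devm_ioremap_resource() plus the
register offset.

```c
/* Userspace sketch: the macros and accessors below are stand-ins so the
 * example builds outside the kernel. A real driver gets them from
 * <linux/io.h> and related headers. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define __iomem                  /* sparse annotation, empty here */
#define cpu_relax() ((void)0)    /* stand-in for the kernel's cpu_relax() */
#define POLL_BIT (1u << 31)      /* hypothetical name for the polled bit 31 */

static uint32_t fake_reg;              /* plays the role of the device register */
static unsigned int polls_until_clear; /* reads left before bit 31 drops */
static unsigned int reads_seen;        /* how many reads the loop issued */

/* Stand-in for readl_relaxed(); also simulates the device clearing bit 31. */
static uint32_t readl_relaxed(volatile void __iomem *addr)
{
	volatile uint32_t *reg = addr;

	assert(addr != NULL);
	reads_seen++;
	if (polls_until_clear && --polls_until_clear == 0)
		*reg &= ~POLL_BIT;
	return *reg;
}

/* Stand-in for writel_relaxed(). */
static void writel_relaxed(uint32_t val, volatile void __iomem *addr)
{
	volatile uint32_t *reg = addr;

	assert(addr != NULL);
	*reg = val;
}

/* Wait until bit 31 reads back as zero, then clear the register. */
static int tangox_wait_and_clear(void __iomem *reg)
{
	while (readl_relaxed(reg) & POLL_BIT)
		cpu_relax();
	writel_relaxed(0, reg);
	return 0;
}
```

In a real driver the loop would probably also want a timeout, so that a
stuck bit cannot hang the CPU forever.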

> gcc generates the following code:
> (version and command-line in sig below)
> 
> 00000014 <tangox_target>:
>    14:   e3a03000        mov     r3, #0
>    18:   e34f3001        movt    r3, #61441      ; 0xf001
>    1c:   e3a02000        mov     r2, #0
>    20:   e34f2001        movt    r2, #61441      ; 0xf001
>    24:   e5931024        ldr     r1, [r3, #36]   ; 0x24
>    28:   e3510000        cmp     r1, #0
>    2c:   bafffffa        blt     1c <tangox_target+0x8>
>    30:   e3a00000        mov     r0, #0
>    34:   e5820024        str     r0, [r2, #36]   ; 0x24
>    38:   e12fff1e        bx      lr
> 
> Do you know why gcc duplicates the address in r2 and r3?
> And keeps putting the address in r2 over and over in the loop?
> 
> I was expecting something more along these lines:
> 
> 00000014 <tangox_target>:
>    14:   e3a03000        mov     r3, #0
>    18:   e34f3001        movt    r3, #61441      ; 0xf001
>    1c:   e5931024        ldr     r1, [r3, #36]   ; 0x24
>    20:   e3510000        cmp     r1, #0
>    24:   bafffffa        blt     1c <tangox_target+0x8>
>    28:   e3a00000        mov     r0, #0
>    2c:   e5820024        str     r0, [r3, #36]   ; 0x24
>    30:   e12fff1e        bx      lr

I suspect the use of 'volatile' just makes gcc avoid all
optimizations here. Try cleaning up the code first and see if it
still happens; if it does, use a local variable to store the
__iomem token.
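
The local-variable idea could look roughly like the sketch below, which
again builds in userspace with stand-in accessors. REG_OFFSET, POLL_BIT,
fake_window and tangox_target_sketch() are hypothetical names, and the
0x24 offset assumes 'base' maps the block that holds the register.
Whether gcc then materializes the address only once is something to
confirm in the disassembly, not something this sketch proves.

```c
/* Userspace sketch: stand-ins replace the kernel's <linux/io.h> helpers. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define __iomem                  /* sparse annotation, empty here */
#define cpu_relax() ((void)0)    /* stand-in for the kernel's cpu_relax() */
#define REG_OFFSET 0x24u         /* assumed offset of the register from base */
#define POLL_BIT (1u << 31)      /* hypothetical name for the polled bit 31 */

static uint32_t fake_window[16]; /* stands in for the ioremap()ed region */

/* Stand-in for readl_relaxed(). */
static uint32_t readl_relaxed(volatile void __iomem *addr)
{
	return *(volatile uint32_t *)addr;
}

/* Stand-in for writel_relaxed(). */
static void writel_relaxed(uint32_t val, volatile void __iomem *addr)
{
	*(volatile uint32_t *)addr = val;
}

static int tangox_target_sketch(void __iomem *base)
{
	void __iomem *reg;

	assert(base != NULL);
	/* Compute the register address once and reuse the same token.
	 * Kernel code usually writes 'base + REG_OFFSET' directly, relying
	 * on GNU C arithmetic on void pointers; the cast keeps this
	 * sketch standard C. */
	reg = (uint8_t __iomem *)base + REG_OFFSET;

	while (readl_relaxed(reg) & POLL_BIT)
		cpu_relax();
	writel_relaxed(0, reg);
	return 0;
}
```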

	Arnd


