[PATCH 3/4] ARM: atomic ops: add memory constraints to inline asm

Will Deacon will.deacon at arm.com
Thu Jul 8 05:36:29 EDT 2010

Hi Nicolas,

> > Currently, the 32-bit and 64-bit atomic operations on ARM do not
> > include memory constraints in the inline assembly blocks. In the
> > case of barrier-less operations [for example, atomic_add], this
> > means that the compiler may constant fold values which have actually
> > been modified by a call to an atomic operation.

Thanks a lot for looking at this.

> Why do you use the "o" constraint?  That's for an offsetable
> memory reference.  Since we already use the actual address value with a
> register constraint, it would be more logical to simply use the "Q"
> constraint alone without any "o".  The gcc manual says:
> |    `Q'
> |          A memory reference where the exact address is in a single
> |          register
> Both "Qo" and "Q" provide the same wanted end result in this
> case.  But a quick test shows that "Q" produces exactly what we need,
> more so than "Qo", because of the other operand which is the actual
> address (the same register is used in both cases).

Whilst using "Q" on its own does generate correct code, using "Qo" is
a slight optimisation. The issue with "Q" is that GCC computes the address
again, even though it has already done so for the "r" constraint. Ideally,
we'd ditch the "r" constraint and just use "Q", but unfortunately this results
in code like ldrex r0, [r1, #0] which GAS refuses to accept. If we use "o",
then GCC doesn't compute the address twice, but can fail if it ends up with
a non-offsettable address (complaining that the constraints are impossible
to satisfy). Using "Qo" results in "o" being used if possible but, where
it doesn't match, "Q" is used at the expense of an extra register:

With "Q":

 740:   f57ff05f        dmb     sy
 744:   e30f4001        movw    r4, #61441      ; 0xf001
 748:   e30a5bad        movw    r5, #43949      ; 0xabad
 74c:   e34f400d        movt    r4, #61453      ; 0xf00d
 750:   e34f5ace        movt    r5, #64206      ; 0xface
 754:   e24be034        sub     lr, fp, #52     ; 0x34    <--- Redundant address computation
 758:   e3a02000        mov     r2, #0
 75c:   e1b36f9f        ldrexd  r6, [r3]
 760:   e1360004        teq     r6, r4
 764:   01370005        teqeq   r7, r5
 768:   01a32f90        strexdeq        r2, r0, [r3]
 76c:   e3520000        cmp     r2, #0
 770:   1afffff7        bne     754 <test_atomic64+0x754>
 774:   f57ff05f        dmb     sy

With "Qo":

 6f4:   f57ff05f        dmb     sy
 6f8:   e30f4001        movw    r4, #61441      ; 0xf001
 6fc:   e30a5bad        movw    r5, #43949      ; 0xabad
 700:   e34f400d        movt    r4, #61453      ; 0xf00d
 704:   e34f5ace        movt    r5, #64206      ; 0xface
 708:   e3a02000        mov     r2, #0
 70c:   e1b36f9f        ldrexd  r6, [r3]
 710:   e1360004        teq     r6, r4
 714:   01370005        teqeq   r7, r5
 718:   01a32f90        strexdeq        r2, r0, [r3]
 71c:   e3520000        cmp     r2, #0
 720:   1afffff8        bne     708 <test_atomic64+0x708>
 724:   f57ff05f        dmb     sy
> In any case:
> Reviewed-by: Nicolas Pitre <nicolas.pitre at linaro.org>

Thanks, I'll submit this to the patch system today.

> Also, I'm assuming that such a constraint should be added to the 32 bit
> versions too for the same correctness reason?

That's right. The patch updates the 32-bit variants as well as the
64-bit ones.



More information about the linux-arm-kernel mailing list