[PATCH 1/1] [RFCv2] arm: add half-word __xchg

Jamie Lokier jamie at shareable.org
Sun Mar 28 10:39:26 EDT 2010


Mathieu Desnoyers wrote:
> * Jamie Lokier (jamie at shareable.org) wrote:
> > Russell King - ARM Linux wrote:
> > > I wonder if we should be using __alignof__ here.
> > > 
> > > 	unsigned long *ptrbig = (unsigned long *)((unsigned long)ptr &
> > > 		(__alignof__(unsigned long) - 1));
> > 
> > Are there ARM targets with a smaller value from __alignof__()?
> > I think you meant:
> > 
> > 	unsigned long *ptrbig = (unsigned long *)((unsigned long)ptr &
> > 		~(unsigned long)(__alignof__(unsigned long) - 1));
> > 
> > Perhaps in asm-generic that has a place.  It would simplify the asm if
> > the alignment is 1 on some machine.
> > 
> > But I don't think it'd happen anyway.  There are machines which don't
> > require full alignment of long, but insufficiently aligned *atomic*
> > accesses like cmpxchg are *not atomic*.  x86 is one such machine.
> 
> I think you mean CMPXCHG8B and CMPXCHG16B ?
> 
> If my memory serves me well, cmpxchg is fine with unaligned accesses on
> x86. But you are right in that other architectures will have a hard time
> with non-aligned cmpxchg. I'm just not sure about x86 specifically.

I'm surprised; I thought I'd read it somewhere.  But you're right,
except that CMPXCHG8B also permits arbitrary alignment, as an old
Intel manual states:

    The integrity of the LOCK prefix is not affected by the alignment
    of the memory field. Memory locking is observed for arbitrarily
    misaligned fields.

I didn't check anything about CMPXCHG16B.

What I'd misremembered was this:

    To improve performance of applications, AMD64 processors can
    speculatively execute instructions out of program order and
    temporarily hold out-of-order results. However, certain rules are
    followed with regard to normal cacheable accesses on naturally
    aligned boundaries to WB memory.

In other words, the x86 implicit SMP barriers are only guaranteed for
naturally aligned accesses.

ARM does require naturally aligned atomics, and that's an issue if a
kernel is built for OABI, which has 4-byte alignment for "long long".
Fortunately it throws a noisy alignment fault, rather than being
quietly non-atomic :-)
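
To make the masking and natural-alignment points concrete, here is a
minimal userspace sketch.  It is not kernel code and not part of the
patch; the helper names align_down_long(), offset_in_long() and
is_naturally_aligned() are made up for illustration, and it assumes a
compiler that supports __alignof__ (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: round ptr down to the start of the aligned
 * unsigned long that contains it.  Note the ~ on the mask: without it,
 * the AND keeps only the low offset bits instead of clearing them.
 */
static inline unsigned long *align_down_long(const volatile void *ptr)
{
	uintptr_t mask = (uintptr_t)__alignof__(unsigned long) - 1;

	return (unsigned long *)((uintptr_t)ptr & ~mask);
}

/* Hypothetical helper: byte offset of ptr within that unsigned long. */
static inline size_t offset_in_long(const volatile void *ptr)
{
	return (uintptr_t)ptr & ((uintptr_t)__alignof__(unsigned long) - 1);
}

/*
 * Hypothetical helper: nonzero if ptr is aligned to size, where size
 * is the (power-of-two) width of the access.
 */
static inline int is_naturally_aligned(const volatile void *ptr, size_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return ((uintptr_t)ptr & (size - 1)) == 0;
}
```

For example, given "unsigned long w;" on a target where
__alignof__(unsigned long) is at least 2, align_down_long((char *)&w + 1)
returns &w and offset_in_long() on the same pointer returns 1.  A check
like is_naturally_aligned(p, 8) is the kind of test that a 64-bit atomic
on a 4-byte-aligned OABI "long long" could fail.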

-- Jamie


