[PATCH v7] arm: use built-in byte swap function

Woodhouse, David david.woodhouse at intel.com
Sun May 26 05:30:00 EDT 2013


On Sun, 2013-05-26 at 07:38 +0200, Dirk Behme wrote:
> On 23.05.2013 18:46, Kim Phillips wrote:
> > Enable the compiler intrinsic for byte swapping on arch ARM.  This
> > allows the compiler to detect and optimize out byte swaps, and has
> > a very modest benefit on vmlinux size (Linaro gcc 4.8):
> 
> I'm no GCC toolchain expert, so I have a question just to check my
> understanding: could anyone kindly give a brief explanation (*) of what
> the advantage of this is on ARM?
> 
> http://comments.gmane.org/gmane.linux.kernel.cross-arch/16016
> 
> mentions "lwbrx/stwbrx on PowerPC, movbe on Atom". But for ARM?
> 
> I don't yet understand why the __arch_swabXX() helpers in
> arch/arm/include/asm/swab.h [1] aren't sufficient. How can this be
> done better? E.g. does anybody have a disassembly without/with this
> change to illustrate that?

The point is just that the compiler gets to *see* what's happening.

See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55177 for a bunch of
examples of things that GCC ought to be able to optimise, even without
the CPU having load-and-swap instructions. Not that it always does;
hence the PR. But there are some that it does currently manage,
evidently.

You'll see this if you follow the link above, but as an example: imagine
a code sequence that goes load, swap, mask, swap, store.

With the swaps done by opaque inline asm, there's nothing the compiler
can do to optimise this. But if it *knows* what's going on, it can
optimise it into a single load, mask of a pre-byte-swapped constant, and
store.
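
To make that concrete, here is a minimal sketch of the sequence, assuming
GCC's (or Clang's) __builtin_bswap32. The function names mask_swapped and
mask_preswapped and the mask value are hypothetical, chosen only for
illustration. The second function is what the first can in principle be
folded into, since bswap(bswap(x) & C) == x & bswap(C); whether a given
compiler version actually performs that fold is not guaranteed.

```c
#include <stdint.h>

/* load, swap, mask, swap, store: the swaps are visible to the compiler */
void mask_swapped(uint32_t *p)
{
	uint32_t v = *p;             /* load */

	v = __builtin_bswap32(v);    /* swap */
	v &= 0x0000ff00u;            /* mask (hypothetical constant) */
	v = __builtin_bswap32(v);    /* swap back */
	*p = v;                      /* store */
}

/* equivalent form: load, mask with the pre-byte-swapped constant, store */
void mask_preswapped(uint32_t *p)
{
	*p &= 0x00ff0000u;           /* 0x00ff0000 == bswap32(0x0000ff00) */
}
```

Both functions leave the same value in *p for any input. With the swaps
hidden inside inline asm, the compiler has no way to see that.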

Having said that, I can't actually answer your question. I don't know
which optimisations the compiler *is* doing to provide the "modest
benefit" that Kim mentions; every class of optimisation I explicitly
checked for was missing. Again, hence the PR. But evidently it does
manage to get *something* right.

-- 
                Sent with Evolution's ActiveSync support.

David Woodhouse                            Open Source Technology Centre
David.Woodhouse at intel.com                              Intel Corporation


