[PATCH v7] arm: use built-in byte swap function
Woodhouse, David
david.woodhouse at intel.com
Sun May 26 05:30:00 EDT 2013
On Sun, 2013-05-26 at 07:38 +0200, Dirk Behme wrote:
> On 23.05.2013 18:46, Kim Phillips wrote:
> > Enable the compiler intrinsic for byte swapping on arch ARM. This
> > allows the compiler to detect and be able to optimize out byte
> > swappings, and has a very modest benefit on vmlinux size (Linaro gcc
> > 4.8):
>
> I'm no GCC tool chain expert, so I just have an understanding
> question: Could anyone kindly give a brief explanation (*) of what the
> advantage of this is on ARM?
>
> http://comments.gmane.org/gmane.linux.kernel.cross-arch/16016
>
> mentions "lwbrx/stwbrx on PowerPC, movbe on Atom". But for ARM?
>
> I haven't understood yet why the __arch_swabXX() in
> arch/arm/include/asm/swab.h [1] aren't sufficient? How can this be
> done better? E.g. does anybody have a disassembly without/with this
> change to illustrate that?

The point is just that the compiler gets to *see* what's happening.

See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55177 for a bunch of
examples of things that GCC ought to be able to optimise, even without
the CPU having load-and-swap instructions. Not that it always does;
hence the PR. But there are some that it does currently manage,
evidently.

You'll see this if you follow the link above, but as an example: imagine
a code sequence that goes load, swap, mask, swap, store.

With the swaps done by opaque inline asm, there's nothing the compiler
can do to optimise this. But if it *knows* what's going on, it can
optimise it into a single load, a mask with a pre-byte-swapped constant,
and a store.
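To make the load, swap, mask, swap, store case concrete, here is a minimal sketch in C. It assumes GCC or Clang (for __builtin_bswap32); the function names and the mask value are made up for illustration and come from neither the patch nor the kernel. The first function is what the source code asks for, the second is what it could be reduced to once the swaps are visible to the compiler. Whether a given compiler version actually performs this folding is a separate matter.

```c
#include <stdint.h>

/* Hypothetical mask, chosen to be asymmetric so that the swap matters. */
#define FIELD_MASK         0x0000ffffu
/* The same mask with its four bytes reversed, worked out by hand. */
#define FIELD_MASK_SWAPPED 0xffff0000u

/* load, swap, mask, swap, store: both swaps use the compiler builtin,
 * so the optimiser can reason about them. */
void mask_with_swaps(uint32_t *p)
{
	uint32_t v = __builtin_bswap32(*p);	/* load + swap */
	v &= FIELD_MASK;			/* mask */
	*p = __builtin_bswap32(v);		/* swap + store */
}

/* Equivalent form: load, mask with the pre-swapped constant, store.
 * Valid because bswap(bswap(x) & M) == x & bswap(M). */
void mask_preswapped(uint32_t *p)
{
	*p &= FIELD_MASK_SWAPPED;
}
```

If the two swaps in mask_with_swaps() were opaque inline asm instead, the compiler would have to emit both of them as written.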
Having said that, I can't actually answer your question: I don't know
which optimisations the compiler *is* doing to provide the "modest
benefit" that Kim mentions; every class of optimisation I explicitly
checked for was missing. Again, hence the PR. But evidently it does
manage to get *something* right.
--
Sent with Evolution's ActiveSync support.
David Woodhouse Open Source Technology Centre
David.Woodhouse at intel.com Intel Corporation