[PATCH v6] arm: use built-in byte swap function

Nicolas Pitre nico at fluxnic.net
Fri Feb 22 21:40:17 EST 2013


On Fri, 22 Feb 2013, Kim Phillips wrote:

> On Thu, 21 Feb 2013 22:40:08 -0500
> Nicolas Pitre <nico at fluxnic.net> wrote:
> 
> > On Thu, 21 Feb 2013, Kim Phillips wrote:
> > 
> > > Here's the asm version I'm working on now, based on compiler
> > > output of the C version.  Haven't tested beyond defconfig builds,
> > > which pass ok.
> > > 
> > > Is there anything I have to do for thumb mode?  If so, how to test?
> > 
> > You just need to pick a config that uses some ARMv7 processor, and 
> > enable CONFIG_THUMB2_KERNEL.  I don't see any problem with your patch 
> > wrt Thumb2.
> 
> ok, I've addressed your comments and tested both pre-armv6 and armv6
> + bswapsi2s on i.mx hardware with CONFIG_CC_OPTIMIZE_FOR_SIZE and
> CONFIG_THUMB2_KERNEL set:
> 
> From c22f4050174d8da71fdddba2cf67ae40c00ca5cc Mon Sep 17 00:00:00 2001
> From: Kim Phillips <kim.phillips at freescale.com>
> Date: Tue, 19 Feb 2013 17:16:11 -0600
> Subject: [PATCH] arm: use built-in byte swap function
> 
> Enable the compiler intrinsic for byte swapping on arch ARM.  This
> allows the compiler to detect and be able to optimize out byte
> swappings, and has a tiny benefit on vmlinux size (Linaro gcc 4.7.3):
> 
>    text	   data	    bss	    dec	    hex	filename
> 2754100	 121144	  56520	2931764	 2cbc34	vmlinux-lart #orig
> 2754050	 121144	  56520	2931714	 2cbc02	vmlinux-lart #builtin-bswap
> 6282699	 307852	5578076	12168627 b9adb3	vmlinux-mxs #orig
> 6282241	 307832	5578076	12168149 b9abd5	vmlinux-mxs #builtin-bswap
> 7200193	 364180	 361748	7926121	 78f169	vmlinux-imx_v6_v7 #orig
> 7199515	 364188	 361748	7925451	 78eecb	vmlinux-imx_v6_v7 #builtin-bswap
> 
> Signed-off-by: Kim Phillips <kim.phillips at freescale.com>

Reviewed-by: Nicolas Pitre <nico at linaro.org>


> ---
> akin to: http://comments.gmane.org/gmane.linux.kernel.cross-arch/16016
> 
> based on linux-next-20130221.
> 
> changes from last diff:
> - addressed Nicolas' comments
> - updated commit text figures and reformatted as a patch
> 
> changes from diff before that:
> - 1st asm version
> 
> changes from diff before that:
> - enforce -O2 for bswapsdi2.o
> - fix building out-of-source tree
> 
> changes from diff before that:
> - implement custom __bswap[sd]i2 in arch/arm/lib/bswapsdi2.c
> 
> v5: re-work based on new gcc version test data:
>   - moved outside armv6 protection
>   - check for gcc 4.6+ demoted to gcc 4.5+ with:
>     !defined(CONFIG_CC_OPTIMIZE_FOR_SIZE)
> 
> v4:
> - undo v2-2's addition of ARCH_DEFINES_BUILTIN_BSWAP per Boris
>   and David - object is to find arches that define _HAVE_BSWAP
>   and clean it up in the future: patch is much less intrusive. :)
> 
> v3:
> - moved out of uapi swab.h into arch/arm/include/asm/swab.h
> - moved ARCH_DEFINES_BUILTIN_BSWAP help text into commit message
> - moved GCC_VERSION >= 40800 ifdef into GCC_VERSION >= 40600 block
> 
> v2:
> - at91 and lpd270 builds fixed by limiting to ARMv6 and above
>   (i.e., ARM cores that have support for the 'rev' instruction).
>   Otherwise, the compiler emits calls to libgcc's __bswapsi2 on
>   these ARMv4/v5 builds (and arch ARM doesn't link with libgcc).
>   All ARM defconfigs now have the same build status as they did
>   without this patch (some are broken on linux-next).
> 
> - move ARM check from generic compiler.h to arch ARM's swab.h.
>   - pretty sure it should be limited to __KERNEL__ builds
> 
> - add new ARCH_DEFINES_BUILTIN_BSWAP (see Kconfig help).
>   - if set, generic compiler header does not set HAVE_BUILTIN_BSWAPxx
>   - not too sure about this having to be a new CONFIG_, but it's hard
>     to find a place for it given linux/compiler.h doesn't include any
>     arch-specific files.
> 
> - move new selects to end of CONFIG_ARM's Kconfig select list,
>   as is done in David Woodhouse's original patch series for ppc/x86.
> 
>  arch/arm/Kconfig                  |  1 +
>  arch/arm/boot/compressed/Makefile | 15 +++++++++++----
>  arch/arm/kernel/armksyms.c        |  4 ++++
>  arch/arm/lib/Makefile             |  2 +-
>  arch/arm/lib/bswapsdi2.S          | 36 ++++++++++++++++++++++++++++++++++++
>  5 files changed, 53 insertions(+), 5 deletions(-)
>  create mode 100644 arch/arm/lib/bswapsdi2.S
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index dedf02b..e8a41d0 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -59,6 +59,7 @@ config ARM
>  	select CLONE_BACKWARDS
>  	select OLD_SIGSUSPEND3
>  	select OLD_SIGACTION
> +	select ARCH_USE_BUILTIN_BSWAP
>  	help
>  	  The ARM series is a line of low-power-consumption RISC chip designs
>  	  licensed by ARM Ltd and targeted at embedded applications and
> diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile
> index 5cad8a6..d9b5ee5 100644
> --- a/arch/arm/boot/compressed/Makefile
> +++ b/arch/arm/boot/compressed/Makefile
> @@ -108,12 +108,12 @@ endif
>  
>  targets       := vmlinux vmlinux.lds \
>  		 piggy.$(suffix_y) piggy.$(suffix_y).o \
> -		 lib1funcs.o lib1funcs.S ashldi3.o ashldi3.S \
> -		 font.o font.c head.o misc.o $(OBJS)
> +		 lib1funcs.o lib1funcs.S ashldi3.o ashldi3.S bswapsdi2.o \
> +		 bswapsdi2.S font.o font.c head.o misc.o $(OBJS)
>  
>  # Make sure files are removed during clean
>  extra-y       += piggy.gzip piggy.lzo piggy.lzma piggy.xzkern \
> -		 lib1funcs.S ashldi3.S $(libfdt) $(libfdt_hdrs)
> +		 lib1funcs.S ashldi3.S bswapsdi2.S $(libfdt) $(libfdt_hdrs)
>  
>  ifeq ($(CONFIG_FUNCTION_TRACER),y)
>  ORIG_CFLAGS := $(KBUILD_CFLAGS)
> @@ -155,6 +155,12 @@ ashldi3 = $(obj)/ashldi3.o
>  $(obj)/ashldi3.S: $(srctree)/arch/$(SRCARCH)/lib/ashldi3.S
>  	$(call cmd,shipped)
>  
> +# For __bswapsi2, __bswapdi2
> +bswapsdi2 = $(obj)/bswapsdi2.o
> +
> +$(obj)/bswapsdi2.S: $(srctree)/arch/$(SRCARCH)/lib/bswapsdi2.S
> +	$(call cmd,shipped)
> +
>  # We need to prevent any GOTOFF relocs being used with references
>  # to symbols in the .bss section since we cannot relocate them
>  # independently from the rest at run time.  This can be achieved by
> @@ -176,7 +182,8 @@ if [ $(words $(ZRELADDR)) -gt 1 -a "$(CONFIG_AUTO_ZRELADDR)" = "" ]; then \
>  fi
>  
>  $(obj)/vmlinux: $(obj)/vmlinux.lds $(obj)/$(HEAD) $(obj)/piggy.$(suffix_y).o \
> -		$(addprefix $(obj)/, $(OBJS)) $(lib1funcs) $(ashldi3) FORCE
> +		$(addprefix $(obj)/, $(OBJS)) $(lib1funcs) $(ashldi3) \
> +		$(bswapsdi2) FORCE
>  	@$(check_for_multiple_zreladdr)
>  	$(call if_changed,ld)
>  	@$(check_for_bad_syms)
> diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
> index 60d3b73..ba578f7 100644
> --- a/arch/arm/kernel/armksyms.c
> +++ b/arch/arm/kernel/armksyms.c
> @@ -35,6 +35,8 @@ extern void __ucmpdi2(void);
>  extern void __udivsi3(void);
>  extern void __umodsi3(void);
>  extern void __do_div64(void);
> +extern void __bswapsi2(void);
> +extern void __bswapdi2(void);
>  
>  extern void __aeabi_idiv(void);
>  extern void __aeabi_idivmod(void);
> @@ -114,6 +116,8 @@ EXPORT_SYMBOL(__ucmpdi2);
>  EXPORT_SYMBOL(__udivsi3);
>  EXPORT_SYMBOL(__umodsi3);
>  EXPORT_SYMBOL(__do_div64);
> +EXPORT_SYMBOL(__bswapsi2);
> +EXPORT_SYMBOL(__bswapdi2);
>  
>  #ifdef CONFIG_AEABI
>  EXPORT_SYMBOL(__aeabi_idiv);
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index af72969..5383df7 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -13,7 +13,7 @@ lib-y		:= backtrace.o changebit.o csumipv6.o csumpartial.o   \
>  		   ashldi3.o ashrdi3.o lshrdi3.o muldi3.o             \
>  		   ucmpdi2.o lib1funcs.o div64.o                      \
>  		   io-readsb.o io-writesb.o io-readsl.o io-writesl.o  \
> -		   call_with_stack.o
> +		   call_with_stack.o bswapsdi2.o
>  
>  mmu-y	:= clear_user.o copy_page.o getuser.o putuser.o
>  
> diff --git a/arch/arm/lib/bswapsdi2.S b/arch/arm/lib/bswapsdi2.S
> new file mode 100644
> index 0000000..2ba43a0
> --- /dev/null
> +++ b/arch/arm/lib/bswapsdi2.S
> @@ -0,0 +1,36 @@
> +#include <linux/linkage.h>
> +
> +#if __LINUX_ARM_ARCH__ >= 6
> +ENTRY(__bswapsi2)
> +	rev	r0, r0
> +	bx	lr
> +ENDPROC(__bswapsi2)
> +
> +ENTRY(__bswapdi2)
> +	rev	r3, r0
> +	rev	r0, r1
> +	mov	r1, r3
> +	bx	lr
> +ENDPROC(__bswapdi2)
> +#else
> +ENTRY(__bswapsi2)
> +	eor     r3, r0, r0, ror #16
> +	mov     r3, r3, lsr #8
> +	bic     r3, r3, #0xff00
> +	eor     r0, r3, r0, ror #8
> +	mov     pc, lr
> +ENDPROC(__bswapsi2)
> +
> +ENTRY(__bswapdi2)
> +	mov     ip, r1
> +	eor     r3, ip, ip, ror #16
> +	eor     r1, r0, r0, ror #16
> +	mov     r1, r1, lsr #8
> +	mov     r3, r3, lsr #8
> +	bic     r3, r3, #0xff00
> +	bic     r1, r1, #0xff00
> +	eor     r1, r1, r0, ror #8
> +	eor     r0, r3, ip, ror #8
> +	mov     pc, lr
> +ENDPROC(__bswapdi2)
> +#endif
> -- 
> 1.8.1.4
> 
> 
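
For anyone wanting to convince themselves that the pre-ARMv6 fallback in
bswapsdi2.S really is a byte swap, here is a rough user-space C model of it,
one statement per instruction.  This is only a sketch: the names ror32(),
model_bswapsi2(), model_bswapdi2(), ref_bswap32() and model_selfcheck() are
made up for illustration and are not part of the patch, and the 64-bit model
works on values rather than on the r0/r1 register pair.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right, modelling the ARM "ror #n" operand shift (0 < n < 32). */
static uint32_t ror32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/* Model of the pre-ARMv6 __bswapsi2 sequence, one statement per insn. */
static uint32_t model_bswapsi2(uint32_t r0)
{
	uint32_t r3;

	r3 = r0 ^ ror32(r0, 16);	/* eor r3, r0, r0, ror #16 */
	r3 >>= 8;			/* mov r3, r3, lsr #8      */
	r3 &= ~(uint32_t)0xff00;	/* bic r3, r3, #0xff00     */
	return r3 ^ ror32(r0, 8);	/* eor r0, r3, r0, ror #8  */
}

/* 64-bit swap: byte-swap each 32-bit half and exchange the halves. */
static uint64_t model_bswapdi2(uint64_t x)
{
	uint32_t lo = (uint32_t)x;
	uint32_t hi = (uint32_t)(x >> 32);

	return ((uint64_t)model_bswapsi2(lo) << 32) | model_bswapsi2(hi);
}

/* Plain shift-and-mask reference used only for comparison. */
static uint32_t ref_bswap32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0xff00) |
	       ((x << 8) & 0xff0000) | (x << 24);
}

/* Returns 0 if the model agrees with the reference on a set of inputs. */
static int model_selfcheck(void)
{
	uint32_t x = 0x12345678;
	int i;

	for (i = 0; i < 1000; i++) {
		if (model_bswapsi2(x) != ref_bswap32(x))
			return -1;
		/* Swapping twice must give back the original value. */
		if (model_bswapsi2(model_bswapsi2(x)) != x)
			return -1;
		x = x * 1664525u + 1013904223u;	/* simple LCG step */
	}
	assert(model_bswapdi2(0x0102030405060708ull) == 0x0807060504030201ull);
	return 0;
}
```

Calling model_selfcheck() from a small test program should return 0.  The
bic step is what removes the stray (B^D) byte left over after the shift, so
that the final eor with the rotated input leaves exactly the reversed bytes.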
