[PATCH 3/6] arm64: lib: Implement optimized memset routine
Will Deacon
will.deacon at arm.com
Mon Dec 16 11:55:22 EST 2013
On Wed, Dec 11, 2013 at 06:24:39AM +0000, zhichang.yuan at linaro.org wrote:
> From: "zhichang.yuan" <zhichang.yuan at linaro.org>
>
> This patch, based on Linaro's Cortex Strings library, improves
> the performance of the assembly optimized memset() function.
>
> Signed-off-by: Zhichang Yuan <zhichang.yuan at linaro.org>
> Signed-off-by: Deepak Saxena <dsaxena at linaro.org>
> ---
> arch/arm64/lib/memset.S | 227 +++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 201 insertions(+), 26 deletions(-)
>
> diff --git a/arch/arm64/lib/memset.S b/arch/arm64/lib/memset.S
> index 87e4a68..90b973e 100644
> --- a/arch/arm64/lib/memset.S
> +++ b/arch/arm64/lib/memset.S
> @@ -1,13 +1,21 @@
> /*
> * Copyright (C) 2013 ARM Ltd.
> + * Copyright (C) 2013 Linaro.
> + *
> + * This code is based on glibc cortex strings work originally authored by Linaro
> + * and re-licensed under GPLv2 for the Linux kernel. The original code can
> + * be found @
> + *
> + * http://bazaar.launchpad.net/~linaro-toolchain-dev/cortex-strings/trunk/
> + * files/head:/src/aarch64/
> *
> * This program is free software; you can redistribute it and/or modify
> * it under the terms of the GNU General Public License version 2 as
> * published by the Free Software Foundation.
> *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * This program is distributed "as is" WITHOUT ANY WARRANTY of any
> + * kind, whether express or implied; without even the implied warranty
> + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the

Why are you changing this?

> * GNU General Public License for more details.
> *
> * You should have received a copy of the GNU General Public License
> @@ -18,7 +26,7 @@
> #include <asm/assembler.h>
>
> /*
> - * Fill in the buffer with character c (alignment handled by the hardware)
> + * Fill in the buffer with character c
> *
> * Parameters:
> * x0 - buf
> @@ -27,27 +35,194 @@
> * Returns:
> * x0 - buf
> */
> +
> +/* By default we assume that the DC instruction can be used to zero
> +* data blocks more efficiently. In some circumstances this might be
> +* unsafe, for example in an asymmetric multiprocessor environment with
> +* different DC clear lengths (neither the upper nor lower lengths are
> +* safe to use). The feature can be disabled by defining DONT_USE_DC.
> +*/

We already use DC ZVA for clear_page, so I think we should start off using
it unconditionally. If we need to revisit this later, we can, but adding a
random #ifdef doesn't feel like something we need initially.

For the benefit of anybody else reviewing this: the DC ZVA instruction still
works for normal, non-cacheable memory.
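
To illustrate the "different DC clear lengths" concern in the quoted comment, here is a minimal sketch in C of how the DC ZVA block size is decoded from a DCZID_EL0 value (BS in bits [3:0] as log2 of the size in 4-byte words, DZP in bit 4 meaning the instruction is prohibited). The helper names and the idea of passing in one register value per CPU are assumptions made for illustration; none of this comes from the patch or from the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* DCZID_EL0 fields: BS in bits [3:0], DZP in bit 4. */
#define DCZID_BS_MASK 0xfu
#define DCZID_DZP_BIT (1u << 4)

/* Hypothetical helper: true when DC ZVA is permitted (DZP clear). */
static bool dczid_zva_allowed(uint64_t dczid)
{
    return (dczid & DCZID_DZP_BIT) == 0;
}

/*
 * Hypothetical helper: block size in bytes zeroed by one DC ZVA.
 * BS is log2 of the block size in 4-byte words, so bytes = 4 << BS.
 */
static size_t dczid_block_bytes(uint64_t dczid)
{
    return (size_t)4 << (dczid & DCZID_BS_MASK);
}

/*
 * Hypothetical helper: given one DCZID_EL0 value per CPU, return the
 * shared block size in bytes, or 0 when DC ZVA is prohibited on any
 * CPU or the CPUs disagree on the size. Returning 0 on disagreement
 * mirrors the quoted comment's claim that neither the larger nor the
 * smaller length is safe in that case.
 */
static size_t common_zva_block(const uint64_t *dczid, size_t ncpus)
{
    size_t common = 0;

    for (size_t i = 0; i < ncpus; i++) {
        if (!dczid_zva_allowed(dczid[i]))
            return 0;

        size_t bytes = dczid_block_bytes(dczid[i]);

        if (i == 0)
            common = bytes;
        else if (bytes != common)
            return 0;
    }
    return common;
}
```

With BS = 4 the block is 64 bytes. A caller would fall back to ordinary stores whenever common_zva_block() returns 0.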

The comments I made on the earlier patch with respect to the quality of
comments and labels seem to apply to all of the patches in this series.

Will