[PATCH v3 13/14] RISC-V: add infrastructure to allow different str* implementations
Ard Biesheuvel
ardb at kernel.org
Fri Dec 2 00:19:43 PST 2022
On Fri, 2 Dec 2022 at 05:08, Palmer Dabbelt <palmer at dabbelt.com> wrote:
>
> On Wed, 30 Nov 2022 14:56:13 PST (-0800), heiko at sntech.de wrote:
> > From: Heiko Stuebner <heiko.stuebner at vrull.eu>
> >
> > Depending on supported extensions on specific RISC-V cores,
> > optimized str* functions might make sense.
> >
> > This adds basic infrastructure to allow patching the function calls
> > via alternatives later on.
> >
> > The main idea is to have the core str* functions be inline functions
> > which then call the most optimized variant and this call then be
> > replaced via alternatives.
> >
> > The big advantage is that we don't need additional calls.
> > Though we need to duplicate the generic functions as the main code
> > expects either itself or the architecture to provide the str* functions.
> >
> > The added *_generic functions are done in assembler (taken from
> > disassembling the main-kernel functions for now) to allow us to control
> > the used registers.
> >
> > Reviewed-by: Conor Dooley <conor.dooley at microchip.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner at vrull.eu>
> > ---
> > arch/riscv/Makefile | 3 ++
> > arch/riscv/include/asm/string.h | 66 +++++++++++++++++++++++++++++++++
> > arch/riscv/kernel/image-vars.h | 6 +--
> > arch/riscv/lib/Makefile | 3 ++
> > arch/riscv/lib/strcmp.S | 38 +++++++++++++++++++
> > arch/riscv/lib/strlen.S | 29 +++++++++++++++
> > arch/riscv/lib/strncmp.S | 41 ++++++++++++++++++++
> > 7 files changed, 183 insertions(+), 3 deletions(-)
> > create mode 100644 arch/riscv/lib/strcmp.S
> > create mode 100644 arch/riscv/lib/strlen.S
> > create mode 100644 arch/riscv/lib/strncmp.S
> >
> > diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> > index 0d13b597cb55..581e4370c2a6 100644
> > --- a/arch/riscv/Makefile
> > +++ b/arch/riscv/Makefile
> > @@ -80,6 +80,9 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
> > KBUILD_CFLAGS += -fno-omit-frame-pointer
> > endif
> >
> > +# strchr is special case, as gcc might want to call its own strlen from there
> > +KBUILD_CFLAGS += -fno-builtin-strlen -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strchr
>
> I was poking around Ard's comment on that -DRISCV_EFI to try and figure
> out what it was doing, but I think this is the bigger issue. I haven't
> benchmarked anything, but my guess is that turning off support for these
> builtin routines will outweigh the faster outline implementations of
> these routines. I don't have any benchmarks to prove that, but in
> general compilers are pretty smart about handling these builtin routines
> in the common cases and deferring that to runtime is probably the wrong
> option.
>
Indeed. Case in point:
riscv64-linux-gnu-gcc -O -S -o - -xc - <<<"int foo() { return strlen(\"bar\"); }"
gives me
	li	a0,3
	ret
whereas this
riscv64-linux-gnu-gcc -fno-builtin-strlen -O -S -o - -xc - <<<"int foo() { return strlen(\"bar\"); }"
gives me
.LC0:
	.string	"bar"
	.text
	.align	1
	.globl	foo
	.type	foo, @function
foo:
	addi	sp,sp,-16
	sd	ra,8(sp)
	lla	a0,.LC0
	call	strlen@plt
	ld	ra,8(sp)
	addi	sp,sp,16
	jr	ra
Other str* and mem* routines are optimized in similar ways when
dealing with small sizes or compile time constants.
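To illustrate the kind of call sites this affects, here is a small sketch
(not taken from the patch; the function names are made up for the example).
With builtins enabled, the compiler is free to fold or expand each of these
inline; with the corresponding -fno-builtin-* flags, each one is expected to
become an out-of-line call. The exact code generated depends on the compiler
version and optimization level.

```c
#include <stddef.h>
#include <string.h>

/* Literal argument: with builtins enabled this can be folded to the constant 3. */
size_t len_of_literal(void)
{
	return strlen("bar");
}

/* Small fixed size: a candidate for inline expansion instead of a call. */
int same_tag(const void *a, const void *b)
{
	return memcmp(a, b, 4) == 0;
}

/* Constant second operand: the compiler may compare the bytes inline. */
int is_ok(const char *s)
{
	return strcmp(s, "ok") == 0;
}
```

Building this with and without -fno-builtin-strlen -fno-builtin-strcmp and
comparing the -S output should show the difference.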
Also, I'd recommend avoiding redefining these prototypes to static
inline as it deviates from the official prototypes in the C library.
> I haven't looked all that closely at this patch set. Is there some
> reason it's necessary to disable the builtin handling? If not then my
> guess is it's better to leave that enabled unless some benchmarks show
> otherwise (and I don't know of any Zb* hardware to test against).
>
Why not have a single generic version of each in the binary with a
patchable NOP at the start, and patch that to an optimized version
when available? That way, all of the early code can remain as is.
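As a rough userspace analogy of that idea (an assumption-laden sketch, not
kernel code): a single public entry point keeps the standard prototype, starts
out pointing at the generic routine, and is redirected once at init time if an
optimized variant is usable. In the kernel the redirection would be done by
patching the NOP via alternatives rather than through a function pointer, and
all names below (my_strlen, strlen_fast, select_strlen) are hypothetical.

```c
#include <stddef.h>

/* Generic byte-by-byte implementation; always safe to call, even early. */
static size_t strlen_generic(const char *s)
{
	const char *p = s;

	while (*p)
		p++;
	return (size_t)(p - s);
}

/* Stand-in for an extension-optimized variant (hypothetical; same logic here). */
static size_t strlen_fast(const char *s)
{
	size_t n = 0;

	while (s[n])
		n++;
	return n;
}

/* Defaults to the generic version, like an unpatched NOP falling through. */
static size_t (*strlen_impl)(const char *) = strlen_generic;

/* Called once after CPU feature detection; models the patching step. */
void select_strlen(int cpu_has_fast_ext)
{
	if (cpu_has_fast_ext)
		strlen_impl = strlen_fast;
}

/* Single out-of-line entry point with the usual prototype shape. */
size_t my_strlen(const char *s)
{
	return strlen_impl(s);
}
```

Callers only ever see my_strlen(), so code that runs before select_strlen()
simply gets the generic behaviour.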