[PATCH v3 0/14] Zbb string optimizations and call support in alternatives
Conor Dooley
conor at kernel.org
Wed Nov 30 16:02:08 PST 2022
On 30/11/2022 22:56, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner at vrull.eu>
>
> The Zbb extension can be used to make string functions run a lot
> faster.
>
> To allow this, there are essentially two problems to solve:
> - making it possible for str* functions to replace what they do
> in a performant way
>
> This is done by inlining the core functions and then
> using alternatives to call the actual variant.
>
> This of course will need a more intelligent selection mechanism
> down the road when more variants may exist using different
> available extensions.
>
> - actually allowing calls in alternatives
> Function calls use auipc + jalr to reach those 32bit relative
> addresses but when they're compiled the offset will be wrong
> as alternatives live in a different section. So when the patch
> gets applied the address will point to the wrong location.
>
> So similar to arm64 the target addresses need to be updated.
>
> This is probably also helpful for other things needing more
> complex code in alternatives.
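
The address fixup described above can be sketched in plain C. This is a minimal, stand-alone illustration and not the code from the series: the names pair_offset(), pair_set_offset() and fix_auipc_jalr() and the moved_by parameter are hypothetical, and it assumes the standard RISC-V encodings (U-type immediate in bits 31:12 of auipc, sign-extended I-type immediate in bits 31:20 of jalr). Because the call target stays where it is while the instruction pair moves, the encoded offset has to shrink by the distance the pair moved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed pc-relative offset encoded in an auipc+jalr pair. */
static int32_t pair_offset(uint32_t auipc, uint32_t jalr)
{
	uint32_t hi = auipc & 0xfffff000u;	/* U-type imm[31:12] */
	uint32_t lo = (jalr >> 20) & 0xfffu;	/* I-type imm[11:0] */

	if (lo & 0x800u)
		lo |= 0xfffff000u;		/* sign-extend the 12-bit part */
	return (int32_t)(hi + lo);
}

/* Hypothetical helper: write a new offset into the pair, keeping rd/rs1/opcode. */
static void pair_set_offset(uint32_t *auipc, uint32_t *jalr, int32_t off)
{
	uint32_t u = (uint32_t)off;
	/* jalr sign-extends its low part, so round the high part to compensate. */
	uint32_t hi = (u + 0x800u) & 0xfffff000u;
	uint32_t lo = u & 0xfffu;

	*auipc = (*auipc & 0x00000fffu) | hi;
	*jalr = (*jalr & 0x000fffffu) | (lo << 20);
}

/*
 * moved_by (an assumed parameter) is the new address of the pair minus the
 * address it was assembled at. The target is unchanged, so:
 *   new_offset = old_offset - moved_by
 */
static void fix_auipc_jalr(uint32_t *auipc, uint32_t *jalr, int32_t moved_by)
{
	assert((*auipc & 0x7fu) == 0x17u);	/* opcode must be AUIPC */
	assert((*jalr & 0x7fu) == 0x67u);	/* opcode must be JALR */
	pair_set_offset(auipc, jalr, pair_offset(*auipc, *jalr) - moved_by);
}
```

For example, an "auipc ra, 0x1" / "jalr ra, 0x234(ra)" pair (0x00001097, 0x234080e7) encodes an offset of 0x1234; if the pair is copied 0x1000 bytes further up, fix_auipc_jalr() rewrites it to encode 0x234, which reaches the same target from the new location.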
>
>
> In my half-scientific test-case of running the functions in question
> on a 95 character string in a loop of 10000 iterations, the Zbb
> variants shave off around 2/3 of the original runtime.
>
>
> For v2 I got into some sort of cleanup spree for the general instruction
> parsing that already existed. A number of places do their own
> instruction parsing and I tried consolidating some of them.
>
> Notably, the kvm parts still do, but I had to stop somewhere :-)
>
> The series is based on v6.1-rc7 right now.
>
> changes since v2:
> - add patch fixing the c.jalr funct4 value
> - reword some commit messages
> - fix position of auipc addition patch (earlier)
> - fix compile errors from patch-reordering gone wrong
> (worked at the end of v2, but compiling individual patches
> caused issues) - patches are now tested individually
> - limit Zbb variants for GNU as for now
> (LLVM support for .option arch is still under review)
Still no good on that front, chief:
ld.lld: error: undefined symbol: __strlen_generic
>>> referenced by ctype.c
>>> arch/riscv/purgatory/purgatory.ro:(strlcpy)
>>> referenced by ctype.c
>>> arch/riscv/purgatory/purgatory.ro:(strlcat)
>>> referenced by ctype.c
>>> arch/riscv/purgatory/purgatory.ro:(strlcat)
>>> referenced 3 more times
make[5]: *** [/stuff/linux/arch/riscv/purgatory/Makefile:85: arch/riscv/purgatory/purgatory.chk] Error 1
make[5]: Target 'arch/riscv/purgatory/' not remade because of errors.
make[4]: *** [/stuff/linux/scripts/Makefile.build:500: arch/riscv/purgatory] Error 2
allmodconfig, same toolchain as before.
> - prevent str-functions from getting optimized to builtin-variants
>
> changes since v1:
> - a number of generalizations/cleanups for instruction parsing
> - use accessor function to access instructions (Emil)
> - actually patch the correct location when having more than one
> instruction in an alternative block
> - string function cleanups (comments etc) (Conor)
> - move zbb extension above s* extensions in cpu.c lists
>
> changes since rfc:
> - make Zbb code actually work
> - drop some unneeded patches
> - a lot of cleanups
>
> Heiko Stuebner (14):
> RISC-V: fix funct4 definition for c.jalr in parse_asm.h
> RISC-V: add prefix to all constants/macros in parse_asm.h
> RISC-V: detach funct-values from their offset
> RISC-V: add ebreak instructions to definitions
> RISC-V: add auipc elements to parse_asm header
> RISC-V: Move riscv_insn_is_* macros into a common header
> RISC-V: rename parse_asm.h to insn.h
> RISC-V: kprobes: use central defined funct3 constants
> RISC-V: add U-type imm parsing to insn.h header
> RISC-V: add rd reg parsing to insn.h header
> RISC-V: fix auipc-jalr addresses in patched alternatives
> efi/riscv: libstub: mark when compiling libstub
> RISC-V: add infrastructure to allow different str* implementations
> RISC-V: add zbb support to string functions
>
> arch/riscv/Kconfig | 24 ++
> arch/riscv/Makefile | 3 +
> arch/riscv/include/asm/alternative.h | 3 +
> arch/riscv/include/asm/errata_list.h | 3 +-
> arch/riscv/include/asm/hwcap.h | 1 +
> arch/riscv/include/asm/insn.h | 292 +++++++++++++++++++++++
> arch/riscv/include/asm/parse_asm.h | 219 -----------------
> arch/riscv/include/asm/string.h | 83 +++++++
> arch/riscv/kernel/alternative.c | 72 ++++++
> arch/riscv/kernel/cpu.c | 1 +
> arch/riscv/kernel/cpufeature.c | 29 ++-
> arch/riscv/kernel/image-vars.h | 6 +-
> arch/riscv/kernel/kgdb.c | 63 ++---
> arch/riscv/kernel/probes/simulate-insn.c | 19 +-
> arch/riscv/kernel/probes/simulate-insn.h | 26 +-
> arch/riscv/lib/Makefile | 6 +
> arch/riscv/lib/strcmp.S | 38 +++
> arch/riscv/lib/strcmp_zbb.S | 96 ++++++++
> arch/riscv/lib/strlen.S | 29 +++
> arch/riscv/lib/strlen_zbb.S | 115 +++++++++
> arch/riscv/lib/strncmp.S | 41 ++++
> arch/riscv/lib/strncmp_zbb.S | 112 +++++++++
> drivers/firmware/efi/libstub/Makefile | 2 +-
> 23 files changed, 982 insertions(+), 301 deletions(-)
> create mode 100644 arch/riscv/include/asm/insn.h
> delete mode 100644 arch/riscv/include/asm/parse_asm.h
> create mode 100644 arch/riscv/lib/strcmp.S
> create mode 100644 arch/riscv/lib/strcmp_zbb.S
> create mode 100644 arch/riscv/lib/strlen.S
> create mode 100644 arch/riscv/lib/strlen_zbb.S
> create mode 100644 arch/riscv/lib/strncmp.S
> create mode 100644 arch/riscv/lib/strncmp_zbb.S
>
> --
> 2.35.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv