[PATCH v2 0/13] Zbb string optimizations and call support in alternatives
Heiko Stuebner
heiko at sntech.de
Mon Nov 28 02:26:19 PST 2022
From: Heiko Stuebner <heiko.stuebner at vrull.eu>
The Zbb extension can be used to make string functions run a lot
faster.
There are essentially two problems to solve:
- making it possible for the str* functions to switch their
  implementation in a performant way
  This is done by inlining the core functions and then
  using alternatives to call the actual variant.
  This of course will need a more intelligent selection mechanism
  down the road when more variants may exist using different
  available extensions.
- actually allowing calls in alternatives
  Function calls use an auipc + jalr pair to reach 32-bit pc-relative
  addresses, but the offset is calculated at compile time, when the
  alternative still lives in a different section. So when the patch
  gets applied, the address will point to the wrong location.
  So, similar to arm64, the target addresses need to be updated.
  This is probably also helpful for other things needing more
  complex code in alternatives.
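To illustrate the second point, here is a minimal user-space sketch of
such an offset fix-up. It is not the code from this series: the helper
names (insn_is_auipc, auipc_jalr_offset, fix_auipc_jalr, ...) are made up
for illustration, and patch_delta is assumed to be the distance from the
address the alternative was built at to the address it gets patched to.
Only the auipc/jalr encodings themselves are taken from the RISC-V ISA.

```c
#include <assert.h>
#include <stdint.h>

#define OPC_AUIPC 0x17u	/* U-type: imm[31:12] | rd | opcode */
#define OPC_JALR  0x67u	/* I-type: imm[11:0] | rs1 | funct3=0 | rd | opcode */

static int insn_is_auipc(uint32_t insn)
{
	return (insn & 0x7fu) == OPC_AUIPC;
}

static int insn_is_jalr(uint32_t insn)
{
	/* check opcode and funct3 == 0 */
	return (insn & 0x707fu) == OPC_JALR;
}

/* Combined pc-relative offset encoded in an auipc + jalr pair. */
static int32_t auipc_jalr_offset(uint32_t auipc, uint32_t jalr)
{
	uint32_t hi = auipc & 0xfffff000u;
	/* sign-extend the 12-bit I-type immediate */
	uint32_t lo = ((jalr >> 20) ^ 0x800u) - 0x800u;

	return (int32_t)(hi + lo);
}

/* Re-encode the pair so that it carries the given offset. */
static void auipc_jalr_set_offset(uint32_t *auipc, uint32_t *jalr,
				  int32_t offset)
{
	uint32_t off = (uint32_t)offset;
	/* jalr's immediate is signed, so round the upper part */
	uint32_t hi = (off + 0x800u) & 0xfffff000u;
	uint32_t lo = (off - hi) & 0xfffu;

	*auipc = (*auipc & 0x00000fffu) | hi;
	*jalr = (*jalr & 0x000fffffu) | (lo << 20);

	/* round-trip sanity check */
	assert(auipc_jalr_offset(*auipc, *jalr) == offset);
}

/*
 * Hypothetical fix-up: if the two instructions form a call, shrink the
 * offset by the distance the code was moved, so the target stays put.
 * Returns 1 if the pair was fixed up, 0 if it was left untouched.
 */
static int fix_auipc_jalr(uint32_t *auipc, uint32_t *jalr,
			  int32_t patch_delta)
{
	int32_t offset;

	if (!insn_is_auipc(*auipc) || !insn_is_jalr(*jalr))
		return 0;

	/* jalr must jump through the register that auipc wrote */
	if (((*auipc >> 7) & 0x1fu) != ((*jalr >> 15) & 0x1fu))
		return 0;

	offset = auipc_jalr_offset(*auipc, *jalr);
	offset = (int32_t)((uint32_t)offset - (uint32_t)patch_delta);
	auipc_jalr_set_offset(auipc, jalr, offset);

	return 1;
}
```

The rounding by 0x800 in auipc_jalr_set_offset() is needed because the
lower 12 bits are sign-extended by jalr, so the upper 20 bits have to
compensate whenever bit 11 of the offset is set.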
In my half-scientific test-case of running the functions in question
on a 95 character string in a loop of 10000 iterations, the Zbb
variants shave off around 2/3 of the original runtime.
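A measurement along these lines could look roughly like the following
user-space sketch. The actual test setup is not part of this cover
letter, so the harness, its names (bench_fill, bench_ns, strlen_bytewise)
and the use of clock_gettime() are assumptions; only the 95 character
string and the 10000 iterations come from the description above.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define BENCH_STRLEN	95	/* string length used in the test */
#define BENCH_ITERS	10000	/* loop iterations used in the test */

/* Fill buf (at least BENCH_STRLEN + 1 bytes) with a test string. */
static void bench_fill(char *buf)
{
	memset(buf, 'a', BENCH_STRLEN);
	buf[BENCH_STRLEN] = '\0';
}

/* Plain byte-at-a-time strlen, as a baseline to compare against. */
static size_t strlen_bytewise(const char *s)
{
	const char *p = s;

	while (*p)
		p++;

	return (size_t)(p - s);
}

/*
 * Call fn(s) iters times and return the elapsed time in nanoseconds.
 * The result of the last call is stored in *last; the volatile sink
 * keeps the compiler from dropping the calls.
 */
static uint64_t bench_ns(size_t (*fn)(const char *), const char *s,
			 int iters, size_t *last)
{
	struct timespec t0, t1;
	volatile size_t sink = 0;
	int64_t ns;
	int i;

	assert(iters > 0);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		sink = fn(s);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	*last = sink;
	ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
	     (t1.tv_nsec - t0.tv_nsec);

	return (uint64_t)ns;
}
```

Running bench_ns() once with strlen_bytewise and once with the variant
under test, on the same buffer prepared by bench_fill(), gives two
numbers whose ratio can be compared against the figure quoted above.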
For v2 I got into some sort of cleanup spree for the general instruction
parsing that already existed. A number of places do their own
instruction parsing and I tried consolidating some of them.
Notably, the kvm parts still do their own parsing, but I had to stop somewhere :-)
changes since v1:
- a number of generalizations/cleanups for instruction parsing
- use accessor function to access instructions (Emil)
- actually patch the correct location when having more than one
  instruction in an alternative block
- string function cleanups (comments etc) (Conor)
- move zbb extension above s* extensions in cpu.c lists
changes since rfc:
- make Zbb code actually work
- drop some unneeded patches
- a lot of cleanups
Heiko Stuebner (13):
  RISC-V: add prefix to all constants/macros in parse_asm.h
  RISC-V: detach funct-values from their offset
  RISC-V: add ebreak instructions to definitions
  RISC-V: Move riscv_insn_is_* macros into a common header
  RISC-V: rename parse_asm.h to insn.h
  RISC-V: kprobes: use central defined funct3 constants
  RISC-V: add auipc elements to parse_asm header
  RISC-V: add U-type imm parsing to parse_asm header
  RISC-V: add rd reg parsing to parse_asm header
  RISC-V: fix auipc-jalr addresses in patched alternatives
  efi/riscv: libstub: mark when compiling libstub
  RISC-V: add infrastructure to allow different str* implementations
  RISC-V: add zbb support to string functions
 arch/riscv/Kconfig                       |  23 ++
 arch/riscv/include/asm/alternative.h     |   3 +
 arch/riscv/include/asm/errata_list.h     |   3 +-
 arch/riscv/include/asm/hwcap.h           |   1 +
 arch/riscv/include/asm/insn.h            | 292 +++++++++++++++++++++++
 arch/riscv/include/asm/parse_asm.h       | 219 -----------------
 arch/riscv/include/asm/string.h          |  83 +++++++
 arch/riscv/kernel/alternative.c          |  72 ++++++
 arch/riscv/kernel/cpu.c                  |   1 +
 arch/riscv/kernel/cpufeature.c           |  29 ++-
 arch/riscv/kernel/image-vars.h           |   6 +-
 arch/riscv/kernel/kgdb.c                 |  63 ++---
 arch/riscv/kernel/probes/simulate-insn.c |  19 +-
 arch/riscv/kernel/probes/simulate-insn.h |  26 +-
 arch/riscv/lib/Makefile                  |   6 +
 arch/riscv/lib/strcmp.S                  |  38 +++
 arch/riscv/lib/strcmp_zbb.S              |  96 ++++++++
 arch/riscv/lib/strlen.S                  |  29 +++
 arch/riscv/lib/strlen_zbb.S              | 115 +++++++++
 arch/riscv/lib/strncmp.S                 |  41 ++++
 arch/riscv/lib/strncmp_zbb.S             | 112 +++++++++
 drivers/firmware/efi/libstub/Makefile    |   2 +-
 22 files changed, 978 insertions(+), 301 deletions(-)
 create mode 100644 arch/riscv/include/asm/insn.h
 delete mode 100644 arch/riscv/include/asm/parse_asm.h
 create mode 100644 arch/riscv/lib/strcmp.S
 create mode 100644 arch/riscv/lib/strcmp_zbb.S
 create mode 100644 arch/riscv/lib/strlen.S
 create mode 100644 arch/riscv/lib/strlen_zbb.S
 create mode 100644 arch/riscv/lib/strncmp.S
 create mode 100644 arch/riscv/lib/strncmp_zbb.S
--
2.35.1