On 30/11/2022 22:56, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@xxxxxxxx>
>
> The Zbb extension can be used to make string functions run a lot
> faster.
>
> There are essentially two problems to solve:
> - making it possible for str* functions to replace what they do
>   in a performant way
>
>   This is done by inlining the core functions and then
>   using alternatives to call the actual variant.
>
>   This of course will need a more intelligent selection mechanism
>   down the road when more variants may exist using different
>   available extensions.
>
> - actually allowing calls in alternatives
>   Function calls use auipc + jalr to reach those 32bit relative
>   addresses but when they're compiled the offset will be wrong
>   as alternatives live in a different section. So when the patch
>   gets applied the address will point to the wrong location.
>
>   So similar to arm64 the target addresses need to be updated.
>
>   This is probably also helpful for other things needing more
>   complex code in alternatives.
>
>
> In my half-scientific test-case of running the functions in question
> on a 95 character string in a loop of 10000 iterations, the Zbb
> variants shave off around 2/3 of the original runtime.
>
>
> For v2 I got into some sort of cleanup spree for the general instruction
> parsing that already existed. A number of places do their own
> instruction parsing and I tried consolidating some of them.
>
> Notably, the kvm parts still do, but I had to stop somewhere :-)
>
> The series is based on v6.1-rc7 right now.
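
For anyone following along, the auipc + jalr fixup described above can be
sketched in plain C roughly as below. This is only an illustration under
stated assumptions, not the code from the series: the function names are
hypothetical, the two instruction words are passed in as ordinary integers,
and old_pc/new_pc stand for the address the pair was assembled for and the
address it ends up running at. The opcode values and immediate layouts are
the standard ones from the base ISA.

```c
#include <stdint.h>

#define INSN_OPCODE_MASK 0x7fu
#define INSN_OPCODE_AUIPC 0x17u /* U-type: imm[31:12] in bits 31:12 */
#define INSN_OPCODE_JALR 0x67u  /* I-type: imm[11:0] in bits 31:20 */

/* Hypothetical helper: decode the pc-relative offset of an auipc+jalr pair. */
static int32_t auipc_jalr_offset(uint32_t auipc, uint32_t jalr)
{
	uint32_t hi = auipc & 0xfffff000u; /* upper 20 bits, already shifted */
	uint32_t lo = jalr >> 20;          /* 12-bit immediate */

	if (lo & 0x800u)
		lo |= 0xfffff000u;         /* sign-extend the low part */

	/* unsigned addition wraps, which is what the hardware does too */
	return (int32_t)(hi + lo);
}

/* Hypothetical helper: write a new pc-relative offset into the pair. */
static void auipc_jalr_set_offset(uint32_t *auipc, uint32_t *jalr,
				  int32_t offset)
{
	uint32_t off = (uint32_t)offset;
	/* add 0x800 so the sign-extended low part is compensated for */
	uint32_t hi = (off + 0x800u) & 0xfffff000u;
	uint32_t lo = off & 0xfffu;

	*auipc = (*auipc & 0x00000fffu) | hi;
	*jalr = (*jalr & 0x000fffffu) | (lo << 20);
}

/*
 * Hypothetical fixup: the pair was assembled to run at old_pc but has been
 * copied to new_pc. Keep the absolute target the same by adjusting the
 * encoded offset by the distance the code moved.
 * Returns 0 on success, -1 if the words are not a matching auipc+jalr pair.
 */
static int auipc_jalr_fixup(uint32_t *auipc, uint32_t *jalr,
			    uintptr_t old_pc, uintptr_t new_pc)
{
	uint32_t auipc_rd = (*auipc >> 7) & 0x1fu;
	uint32_t jalr_rs1 = (*jalr >> 15) & 0x1fu;
	uint32_t off;

	if ((*auipc & INSN_OPCODE_MASK) != INSN_OPCODE_AUIPC ||
	    (*jalr & INSN_OPCODE_MASK) != INSN_OPCODE_JALR)
		return -1;

	/* jalr has to jump through the register that auipc just wrote */
	if (auipc_rd != jalr_rs1)
		return -1;

	off = (uint32_t)auipc_jalr_offset(*auipc, *jalr);
	off -= (uint32_t)(new_pc - old_pc);
	auipc_jalr_set_offset(auipc, jalr, (int32_t)off);

	return 0;
}
```

The + 0x800 rounding in the encoder is the usual %pcrel_hi/%pcrel_lo split:
jalr sign-extends its 12-bit immediate, so the upper part has to be bumped
by one page whenever bit 11 of the offset is set.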
>
> changes since v2:
> - add patch fixing the c.jalr funct4 value
> - reword some commit messages
> - fix position of auipc addition patch (earlier)
> - fix compile errors from patch-reordering gone wrong
>   (worked at the end of v2, but compiling individual patches
>   caused issues) - patches are now tested individually
> - limit Zbb variants for GNU as for now
>   (LLVM support for .option arch is still under review)

Still no good on that front chief:

ld.lld: error: undefined symbol: __strlen_generic
>>> referenced by ctype.c
>>>               arch/riscv/purgatory/purgatory.ro:(strlcpy)
>>> referenced by ctype.c
>>>               arch/riscv/purgatory/purgatory.ro:(strlcat)
>>> referenced by ctype.c
>>>               arch/riscv/purgatory/purgatory.ro:(strlcat)
>>> referenced 3 more times
make[5]: *** [/stuff/linux/arch/riscv/purgatory/Makefile:85: arch/riscv/purgatory/purgatory.chk] Error 1
make[5]: Target 'arch/riscv/purgatory/' not remade because of errors.
make[4]: *** [/stuff/linux/scripts/Makefile.build:500: arch/riscv/purgatory] Error 2

allmodconfig, same toolchain as before.

> - prevent str-functions from getting optimized to builtin-variants
>
> changes since v1:
> - a number of generalizations/cleanups for instruction parsing
> - use accessor function to access instructions (Emil)
> - actually patch the correct location when having more than one
>   instruction in an alternative block
> - string function cleanups (comments etc) (Conor)
> - move zbb extension above s* extensions in cpu.c lists
>
> changes since rfc:
> - make Zbb code actually work
> - drop some unneeded patches
> - a lot of cleanups
>
> Heiko Stuebner (14):
>   RISC-V: fix funct4 definition for c.jalr in parse_asm.h
>   RISC-V: add prefix to all constants/macros in parse_asm.h
>   RISC-V: detach funct-values from their offset
>   RISC-V: add ebreak instructions to definitions
>   RISC-V: add auipc elements to parse_asm header
>   RISC-V: Move riscv_insn_is_* macros into a common header
>   RISC-V: rename parse_asm.h to insn.h
>   RISC-V: kprobes: use central defined funct3 constants
>   RISC-V: add U-type imm parsing to insn.h header
>   RISC-V: add rd reg parsing to insn.h header
>   RISC-V: fix auipc-jalr addresses in patched alternatives
>   efi/riscv: libstub: mark when compiling libstub
>   RISC-V: add infrastructure to allow different str* implementations
>   RISC-V: add zbb support to string functions
>
>  arch/riscv/Kconfig                       |  24 ++
>  arch/riscv/Makefile                      |   3 +
>  arch/riscv/include/asm/alternative.h     |   3 +
>  arch/riscv/include/asm/errata_list.h     |   3 +-
>  arch/riscv/include/asm/hwcap.h           |   1 +
>  arch/riscv/include/asm/insn.h            | 292 +++++++++++++++++++++++
>  arch/riscv/include/asm/parse_asm.h       | 219 -----------------
>  arch/riscv/include/asm/string.h          |  83 +++++++
>  arch/riscv/kernel/alternative.c          |  72 ++++++
>  arch/riscv/kernel/cpu.c                  |   1 +
>  arch/riscv/kernel/cpufeature.c           |  29 ++-
>  arch/riscv/kernel/image-vars.h           |   6 +-
>  arch/riscv/kernel/kgdb.c                 |  63 ++---
>  arch/riscv/kernel/probes/simulate-insn.c |  19 +-
>  arch/riscv/kernel/probes/simulate-insn.h |  26 +-
>  arch/riscv/lib/Makefile                  |   6 +
>  arch/riscv/lib/strcmp.S                  |  38 +++
>  arch/riscv/lib/strcmp_zbb.S              |  96 ++++++++
>  arch/riscv/lib/strlen.S                  |  29 +++
>  arch/riscv/lib/strlen_zbb.S              | 115 +++++++++
>  arch/riscv/lib/strncmp.S                 |  41 ++++
>  arch/riscv/lib/strncmp_zbb.S             | 112 +++++++++
>  drivers/firmware/efi/libstub/Makefile    |   2 +-
>  23 files changed, 982 insertions(+), 301 deletions(-)
>  create mode 100644 arch/riscv/include/asm/insn.h
>  delete mode 100644 arch/riscv/include/asm/parse_asm.h
>  create mode 100644 arch/riscv/lib/strcmp.S
>  create mode 100644 arch/riscv/lib/strcmp_zbb.S
>  create mode 100644 arch/riscv/lib/strlen.S
>  create mode 100644 arch/riscv/lib/strlen_zbb.S
>  create mode 100644 arch/riscv/lib/strncmp.S
>  create mode 100644 arch/riscv/lib/strncmp_zbb.S
>
> --
> 2.35.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@xxxxxxxxxxxxxxxxxxx
> http://lists.infradead.org/mailman/listinfo/linux-riscv