I realized that ARM uses the generic memmove() implementation, which is
rather slow. This series adds the assembler-optimized version for ARM.
The corresponding recent Linux code doesn't fit into barebox anymore, so
to merge the code the surroundings have to be updated first; hence the
series is bigger than I would like it to be.

Sascha

Signed-off-by: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
---
Sascha Hauer (10):
      ARM: Use optimized reads[bwl] and writes[bwl] functions
      ARM: rename logical shift macros push pull into lspush lspull
      ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+
      ARM: update lib1funcs.S from Linux
      ARM: update findbit.S from Linux
      ARM: update io-* from Linux
      ARM: always assume the unified syntax for assembly code
      ARM: update memcpy.S and memset.S from Linux
      lib/string.c: export non optimized memmove as __default_memmove
      ARM: add optimized memmove

 arch/arm/Kconfig                  |   4 -
 arch/arm/Makefile                 |   3 +
 arch/arm/cpu/cache-armv4.S        |  11 +-
 arch/arm/cpu/cache-armv5.S        |  13 +-
 arch/arm/cpu/cache-armv6.S        |  13 +-
 arch/arm/cpu/cache-armv7.S        |   9 +-
 arch/arm/cpu/hyp.S                |   3 +-
 arch/arm/cpu/setupc_32.S          |   7 +-
 arch/arm/cpu/sm_as.S              |   3 +-
 arch/arm/include/asm/assembler.h  |  36 ++++-
 arch/arm/include/asm/cache.h      |   8 ++
 arch/arm/include/asm/io.h         |  24 ++++
 arch/arm/include/asm/string.h     |   4 +-
 arch/arm/include/asm/unified.h    |  75 +----------
 arch/arm/lib32/Makefile           |   1 +
 arch/arm/lib32/ashldi3.S          |   3 +-
 arch/arm/lib32/ashrdi3.S          |   3 +-
 arch/arm/lib32/copy_template.S    |  94 +++++++------
 arch/arm/lib32/findbit.S          | 243 +++++++++++++--------------------
 arch/arm/lib32/io-readsb.S        |  32 ++---
 arch/arm/lib32/io-readsl.S        |  32 ++---
 arch/arm/lib32/io-readsw-armv4.S  |  26 ++--
 arch/arm/lib32/io-writesb.S       |  34 ++---
 arch/arm/lib32/io-writesl.S       |  36 ++---
 arch/arm/lib32/io-writesw-armv4.S |  16 +--
 arch/arm/lib32/lib1funcs.S        |  80 ++++++-----
 arch/arm/lib32/lshrdi3.S          |   3 +-
 arch/arm/lib32/memcpy.S           |  30 +++--
 arch/arm/lib32/memmove.S          | 206 ++++++++++++++++++++++++++++
 arch/arm/lib32/memset.S           |  96 ++++++++-----
 arch/arm/lib32/runtime-offset.S   |   2 +-
 arch/arm/lib64/copy_template.S    |  11 +-
 arch/arm/lib64/memcpy.S           | 274 ++++++++++++++++++++++++++++++++------
 arch/arm/lib64/memset.S           |  18 ++-
 arch/arm/lib64/string.c           |  17 +++
 include/string.h                  |   2 +
 lib/string.c                      |  11 +-
 37 files changed, 954 insertions(+), 529 deletions(-)
---
base-commit: 419ea9350aa083d4a2806a70132129a49a5ecf95
change-id: 20240925-arm-assembly-memmove-8eccb9affa1b

Best regards,
-- 
Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>