This series was not sent properly.  See v2 instead:

https://lore.kernel.org/all/20240815133357.35829-1-xry111@xxxxxxxxxxx/

On Thu, 2024-08-15 at 21:17 +0800, Xi Ruoyao wrote:
> For the rationale to implement getrandom() in vDSO see [1].
>
> The vDSO getrandom() needs a stack-less ChaCha20 implementation, so we
> need to add architecture-specific code and wire it up with the generic
> code.
>
> Without LSX it's not easy to implement ChaCha20 without a stack.  So the
> current implementation just falls back to a getrandom() syscall if LSX
> is unavailable.  In the 1st patch the existing alternative runtime
> patching mechanism is expanded to cover the vDSO, so we don't need to
> invoke cpucfg for each vDSO getrandom() call.
>
> Then in the 2nd patch stack-less ChaCha20 is implemented with LSX.  The
> code is basically a direct translation of the x86 SSE2 implementation.
> One annoying thing here is that the compiler generates a memset() call
> for a "large" struct initialization in a cold path and there seems to be
> no way to prevent it.  So a naive memset implementation is copied from
> the kernel code into the vDSO.
>
> The implementation is tested with the kernel selftests added by the last
> patch in [1].  I had to make some adjustments to make it work on
> LoongArch (see [2]; I've not submitted the changes as of now because I'm
> unsure about the KHDR_INCLUDES addition).
>
> The vdso_test_getrandom bench-single result:
>
>        vdso: 25000000 times in 0.631345201 seconds
>        libc: 25000000 times in 6.953121083 seconds
>     syscall: 25000000 times in 6.992112386 seconds
>
> The vdso_test_getrandom bench-multi result:
>
>        vdso: 25000000 x 256 times in 29.558284986 seconds
>        libc: 25000000 x 256 times in 356.633930139 seconds
>     syscall: 25000000 x 256 times in 334.885555338 seconds
>
> [1]:https://lore.kernel.org/all/20240712014009.281406-1-Jason@xxxxxxxxx/
> [2]:https://github.com/xry111/linux/commits/xry111/la-vdso/
>
> Cc: linux-crypto@xxxxxxxxxxxxxxx
> Cc: loongarch@xxxxxxxxxxxxxxx
> Cc: Jason A. Donenfeld <Jason@xxxxxxxxx>
> Cc: Huacai Chen <chenhuacai@xxxxxxxxxx>
> Cc: WANG Xuerui <kernel@xxxxxxxxxx>
> Cc: Jinyang He <hejinyang@xxxxxxxxxxx>
> Cc: Tiezhu Yang <yangtiezhu@xxxxxxxxxxx>
> Cc: Arnd Bergmann <arnd@xxxxxxxx>
>
> Xi Ruoyao (2):
>   LoongArch: Perform alternative runtime patching on vDSO
>   LoongArch: vDSO: Wire up getrandom() vDSO implementation
>
>  arch/loongarch/Kconfig                      |   1 +
>  arch/loongarch/include/asm/vdso/getrandom.h |  47 ++++++
>  arch/loongarch/include/asm/vdso/vdso.h      |   8 +
>  arch/loongarch/kernel/asm-offsets.c         |  10 ++
>  arch/loongarch/kernel/vdso.c                |  14 +-
>  arch/loongarch/vdso/Makefile                |   2 +
>  arch/loongarch/vdso/memset.S                |  24 +++
>  arch/loongarch/vdso/vdso.lds.S              |   7 +
>  arch/loongarch/vdso/vgetrandom-alt.S        |  19 +++
>  arch/loongarch/vdso/vgetrandom-chacha.S     | 162 ++++++++++++++++++++
>  arch/loongarch/vdso/vgetrandom.c            |  16 ++
>  11 files changed, 309 insertions(+), 1 deletion(-)
>  create mode 100644 arch/loongarch/include/asm/vdso/getrandom.h
>  create mode 100644 arch/loongarch/vdso/memset.S
>  create mode 100644 arch/loongarch/vdso/vgetrandom-alt.S
>  create mode 100644 arch/loongarch/vdso/vgetrandom-chacha.S
>  create mode 100644 arch/loongarch/vdso/vgetrandom.c
>

-- 
Xi Ruoyao <xry111@xxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University