On Tue, Aug 27, 2024 at 09:20:14PM +0800, Xi Ruoyao wrote:
> diff --git a/arch/loongarch/vdso/vgetrandom-chacha.S b/arch/loongarch/vdso/vgetrandom-chacha.S
> new file mode 100644
> index 000000000000..2e42198f2faf
> --- /dev/null
> +++ b/arch/loongarch/vdso/vgetrandom-chacha.S
> @@ -0,0 +1,239 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2024 Xi Ruoyao <xry111@xxxxxxxxxxx>. All Rights Reserved.
> + */
> +
> +#include <asm/asm.h>
> +#include <asm/regdef.h>
> +#include <linux/linkage.h>
> +
> +.text
> +
> +/* Salsa20 quarter-round */
> +.macro	QR	a b c d
> +	add.w		\a, \a, \b
> +	xor		\d, \d, \a
> +	rotri.w		\d, \d, 16
> +
> +	add.w		\c, \c, \d
> +	xor		\b, \b, \c
> +	rotri.w		\b, \b, 20
> +
> +	add.w		\a, \a, \b
> +	xor		\d, \d, \a
> +	rotri.w		\d, \d, 24
> +
> +	add.w		\c, \c, \d
> +	xor		\b, \b, \c
> +	rotri.w		\b, \b, 25
> +.endm
> +
> +/*
> + * Very basic LoongArch implementation of ChaCha20. Produces a given positive
> + * number of blocks of output with a nonce of 0, taking an input key and
> + * 8-byte counter. Importantly does not spill to the stack. Its arguments
> + * are:
> + *
> + *	a0: output bytes
> + *	a1: 32-byte key input
> + *	a2: 8-byte counter input/output
> + *	a3: number of 64-byte blocks to write to output
> + */
> +SYM_FUNC_START(__arch_chacha20_blocks_nostack)

I can confirm this works:

$ loongarch64-unknown-linux-gnu-gcc -std=gnu99 -D_GNU_SOURCE= -idirafter tools/testing/selftests/../../../tools/include -idirafter tools/testing/selftests/../../../arch/loongarch/include -idirafter tools/testing/selftests/../../../include -D__ASSEMBLY__ -Wa,--noexecstack vdso_test_chacha.c tools/testing/selftests/../../../tools/arch/loongarch/vdso/vgetrandom-chacha.S -o tools/testing/selftests/vDSO/vdso_test_chacha -static
$ qemu-loongarch64 ./vdso_test_chacha
TAP version 13
1..1
ok 1 chacha: PASS

Just waiting now on a v5 as discussed and acks from the LoongArch maintainers
on that v5.

Jason
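
For anyone reading the QR macro quoted above against the usual C description of the ChaCha quarter-round: the macro rotates right by 16, 20, 24 and 25, which for 32-bit words is the same as the more commonly written left rotations by 16, 12, 8 and 7. Below is a minimal C sketch of the same sequence, written with right rotations to mirror the assembly. The names rotr32 and quarter_round are illustrative only; they do not appear in the patch, and this is not the kernel's reference code.

```c
#include <stdint.h>

/*
 * Rotate a 32-bit word right by n bits (0 < n < 32).
 * Mirrors what rotri.w does in the quoted macro.
 * Hypothetical helper, not part of the patch.
 */
static inline uint32_t rotr32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/*
 * One ChaCha quarter-round on four state words, in the same
 * add/xor/rotate order as the QR macro. Illustrative name only.
 */
static void quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = rotr32(*d, 16);	/* rotl by 16 */
	*c += *d; *b ^= *c; *b = rotr32(*b, 20);	/* rotl by 12 */
	*a += *b; *d ^= *a; *d = rotr32(*d, 24);	/* rotl by 8 */
	*c += *d; *b ^= *c; *b = rotr32(*b, 25);	/* rotl by 7 */
}
```

Unlike the assembly, a C version like this gives no guarantee about stack spills, which is the reason the vDSO routine is written by hand.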