Hi Thomas,

On Sat, Nov 26, 2022 at 12:08:41AM +0100, Thomas Gleixner wrote:
> Jason!
>
> On Thu, Nov 24 2022 at 17:55, Jason A. Donenfeld wrote:
> > +++ b/arch/x86/entry/vdso/vgetrandom-chacha.S
> > +/*
> > + * Very basic SSE2 implementation of ChaCha20. Produces a given positive number
> > + * of blocks of output with a nonce of 0, taking an input key and 8-byte
> > + * counter. Importantly does not spill to the stack. Its arguments are:
>
> Basic or not.

Heh, FYI I didn't mean "basic" here as in "doesn't need a review", but
just that it's a straightforward technique and doesn't do any
complicated multiblock pyrotechnics (which frankly aren't really
needed).

> This needs a Reviewed-by from someone who understands SSE2
> and ChaCha20 before this can go anywhere near the x86 tree.

No problem. I'll see to it that somebody qualified gives this a review.

> > +#include <linux/kernel.h>
>
> Why do you need kernel.h here?

Turns out I don't, thanks.

> > +static __always_inline ssize_t
> > +getrandom_syscall(void *buffer, size_t len, unsigned int flags)
>
> static __always_inline ssize_t getrandom_syscall(void *buffer, size_t len, unsigned int flags)
>
> please. We expanded to 100 quite some time ago.
>
> Some kernel-doc compliant comment for this would be appreciated as well.

Will do.

>
> > +{
> > +        long ret;
> > +
> > +        asm ("syscall" : "=a" (ret) :
> > +             "0" (__NR_getrandom), "D" (buffer), "S" (len), "d" (flags) :
> > +             "rcx", "r11", "memory");
> > +
> > +        return ret;
> > +}
> > +
> > +#define __vdso_rng_data (VVAR(_vdso_rng_data))
> > +
> > +static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void)
> > +{
> > +        if (__vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
> > +                return (void *)&__vdso_rng_data +
> > +                       ((void *)&__timens_vdso_data - (void *)&__vdso_data);
> > +        return &__vdso_rng_data;
>
> So either bite the bullet and write it:
>
>         if (__vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
>                 return (void *)&__vdso_rng_data + ((void *)&__timens_vdso_data - (void *)&__vdso_data);

Seems fine to me. I'll write it like that.

> > +/*
> > + * Generates a given positive number of block of ChaCha20 output with nonce=0,
> > + * and does not write to any stack or memory outside of the parameters passed
> > + * to it. This way, we don't need to worry about stack data leaking into forked
> > + * child processes.
>
> Please use proper kernel-doc
>
> > + */
> > +static __always_inline void __arch_chacha20_blocks_nostack(u8 *dst_bytes, const u32 *key, u32 *counter, size_t nblocks)
> > +{
> > +        extern void chacha20_blocks_nostack(u8 *dst_bytes, const u32 *key, u32 *counter, size_t nblocks);
> > +        return chacha20_blocks_nostack(dst_bytes, key, counter, nblocks);
>
> The above aside, can you please explain the value of this __arch_()
> wrapper?
>
> It's just voodoo for no value because it hands through the arguments
> 1:1. So where are you expecting that that __arch...() version of this is
> any different than invoking the architecture specific version of
> chacha20_blocks_nostack().

I'll just name the assembly function with __arch...(). The idea behind
the wrapper was just to keep all of the non-generic code called from the
generic code prefixed with __arch_, but there's no reason I need to name
it like that from C alone.

Will fix for v8.

Thanks again,
Jason
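
For illustration only, and not the actual v8 code: a rough userspace
sketch of how the getrandom_syscall() helper discussed above might look
with the signature on a single line (under 100 columns) and a kernel-doc
style comment. The comment wording, the "volatile" qualifier, the (long)
cast, the __always_inline fallback definition and the non-x86-64
fallback path are all assumptions added for this sketch. They are not
taken from the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() in the fallback path */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>	/* __NR_getrandom, SYS_getrandom */
#include <sys/types.h>		/* ssize_t */
#include <unistd.h>

/* Assumption: libc may already provide __always_inline; define it only if missing. */
#ifndef __always_inline
#define __always_inline inline __attribute__((always_inline))
#endif

/**
 * getrandom_syscall - Invoke the getrandom() syscall directly.
 * @buffer:	Destination buffer to fill with random bytes.
 * @len:	Size of @buffer in bytes.
 * @flags:	Zero or more GRND_* flags.
 *
 * Return: The number of bytes written to @buffer on success, or a
 * negative errno value on failure.
 */
static __always_inline ssize_t getrandom_syscall(void *buffer, size_t len, unsigned int flags)
{
#if defined(__x86_64__)
	long ret;

	/* The syscall instruction clobbers rcx and r11; the kernel writes to *buffer. */
	__asm__ __volatile__("syscall" : "=a" (ret) :
			     "0" ((long)__NR_getrandom), "D" (buffer), "S" (len), "d" (flags) :
			     "rcx", "r11", "memory");
	return ret;
#else
	/* Hypothetical fallback so the sketch builds elsewhere: go through libc. */
	long ret = syscall(SYS_getrandom, buffer, len, flags);

	return ret < 0 ? -errno : ret;
#endif
}
```

Called with a 16-byte buffer and flags of 0, this should return 16 on a
running Linux system. An unknown flag bit should come back as -EINVAL
rather than through errno, since the raw syscall return value is handed
straight to the caller.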