On Fri, 27 Sep 2019 at 09:21, Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> Hey Andy,
>
> Thanks for weighing in.
>
> > inlining. I'd be surprised for chacha20. If you really want inlining
> > to dictate the overall design, I think you need some real numbers for
> > why it's necessary. There also needs to be a clear story for how
> > exactly making everything inline plays with the actual decision of
> > which implementation to use.
>
> Take a look at my description for the MIPS case: when on MIPS, the
> arch code is *always* used since it's just straight up scalar
> assembly. In this case, the chacha20_arch function *never* returns
> false [1], which means it's always included [2], so the generic
> implementation gets optimized out, saving disk and memory, which I
> assume MIPS people care about.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/zx2c4/linux.git/tree/lib/zinc/chacha20/chacha20-mips-glue.c?h=jd/wireguard#n13
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/zx2c4/linux.git/tree/lib/zinc/chacha20/chacha20.c?h=jd/wireguard#n118
>
> I'm fine with considering this a form of "premature optimization",
> though, and ditching the motivation there.
>
> On Thu, Sep 26, 2019 at 11:37 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > My suggestion from way back, which is at
> > least a good deal of the way toward being doable, is to do static
> > calls. This means that the common code will call out to the arch code
> > via a regular CALL instruction and will *not* inline the arch code.
> > This means that the arch code could live in its own module, it can be
> > selected at boot time, etc.
>
> Alright, let's do static calls, then, to deal with the case of going
> from the entry point implementation in lib/zinc (or lib/crypto, if you
> want, Ard) to the arch-specific implementation in arch/${ARCH}/crypto.
> And then within each arch, we can keep it simple, since everything is
> already in the same directory.
>
> Sound good?
>

Yup.

I posted something to this effect - I am ironing out some wrinkles
doing randconfig builds (with Arnd's help) but the general picture
shouldn't change.
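
For anyone following along, here is a rough userspace sketch of the dispatch
shape being discussed: common code calls the selected implementation through
one out-of-line call, and the choice is made once at init time instead of
being inlined. It is an illustration under assumptions, not the posted
patches. A plain function pointer stands in for the static call (a real
static call would patch the call site into a direct CALL), the transform is
a placeholder and not ChaCha20, and every name in it (stream_xor,
select_impl, last_impl, and so on) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature; real ChaCha20 code also takes key/counter state. */
typedef void (*stream_xor_fn)(uint8_t *dst, const uint8_t *src, size_t len);

/* Records which implementation ran last: 1 = generic, 2 = arch. Demo only. */
int last_impl;

/* Placeholder transform, NOT ChaCha20: XOR each byte with a fixed value. */
static void xor_placeholder(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;
}

/* Stand-in for the generic C implementation in the common library code. */
static void stream_xor_generic(uint8_t *dst, const uint8_t *src, size_t len)
{
	last_impl = 1;
	xor_placeholder(dst, src, len);
}

/* Stand-in for an arch-specific implementation that could live elsewhere. */
static void stream_xor_arch(uint8_t *dst, const uint8_t *src, size_t len)
{
	last_impl = 2;
	xor_placeholder(dst, src, len);
}

/* The call target; defaults to generic until an arch version is selected. */
static stream_xor_fn active_impl = stream_xor_generic;

/* Called once at init; cpu_has_fast_path models a CPU feature probe. */
void select_impl(bool cpu_has_fast_path)
{
	active_impl = cpu_has_fast_path ? stream_xor_arch : stream_xor_generic;
}

/* Common entry point: one out-of-line call, nothing inlined. */
void stream_xor(uint8_t *dst, const uint8_t *src, size_t len)
{
	active_impl(dst, src, len);
}
```

Callers only ever see stream_xor(), so the arch code does not need to be
visible to the common code at compile time. The trade-off is that the
compiler can no longer drop the generic version on an arch where the arch
version is always used, which is the MIPS point quoted above.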