Re: [PATCH v2 02/20] crypto: x86/chacha - expose SIMD ChaCha routine as library function

On Fri, 4 Oct 2019 at 15:36, Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> On Wed, Oct 02, 2019 at 04:16:55PM +0200, Ard Biesheuvel wrote:
> > Wire the existing x86 SIMD ChaCha code into the new ChaCha library
> > interface.
> >
> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> > ---
> >  arch/x86/crypto/chacha_glue.c | 36 ++++++++++++++++++++
> >  crypto/Kconfig                |  1 +
> >  include/crypto/chacha.h       |  6 ++++
> >  3 files changed, 43 insertions(+)
> >
> > diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c
> > index bc62daa8dafd..fd9ef42842cf 100644
> > --- a/arch/x86/crypto/chacha_glue.c
> > +++ b/arch/x86/crypto/chacha_glue.c
> > @@ -123,6 +123,42 @@ static void chacha_dosimd(u32 *state, u8 *dst, const u8 *src,
> >       }
> >  }
> >
> > +void hchacha_block(const u32 *state, u32 *stream, int nrounds)
> > +{
> > +     state = PTR_ALIGN(state, CHACHA_STATE_ALIGN);
> > +
> > +     if (!crypto_simd_usable()) {
> > +             hchacha_block_generic(state, stream, nrounds);
> > +     } else {
> > +             kernel_fpu_begin();
> > +             hchacha_block_ssse3(state, stream, nrounds);
> > +             kernel_fpu_end();
> > +     }
> > +}
> > +EXPORT_SYMBOL(hchacha_block);
>
> Please correct me if I'm wrong:
>
> The approach here is slightly different from Zinc. In Zinc, I had one
> entry point that conditionally called into the architecture-specific
> implementation, and I did it inline using #includes so that in some
> cases it could be optimized out.
>
> Here, you override the original symbol defined by the generic module
> from the architecture-specific implementation, and in there you decide
> which way to branch.
>
> Your approach has the advantage that you don't need to #include a .c
> file like I did, an ugly yet very effective approach.
>
> But it has two disadvantages:
>
> 1. For architecture-specific code that _always_ runs, such as the
>   MIPS32r2 implementation of chacha, the compiler no longer has an
>   opportunity to remove the generic code entirely from the binary,
>   which under Zinc resulted in a smaller module.
>

It does. If your code never calls hchacha_block_generic(), the library
that exposes it is never loaded in the first place.

Note that in this particular case, hchacha_block_generic() is exposed
by code that is always built in, so it doesn't matter.

> 2. The inliner can't make optimizations for that call.
>
> Disadvantage (2) might not make much of a difference. Disadvantage (1)
> seems like a bigger deal. However, perhaps the linker is smart and can
> remove the code and symbol? Or if not, is there a way to make the linker
> smart? Or would all this require crazy LTO which isn't going to happen
> any time soon?
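
For readers following the thread, the runtime-dispatch shape of the quoted
hchacha_block() can be sketched in user space roughly as below. This is only
an illustration under stated assumptions: toy_block(), toy_block_generic(),
toy_block_simd(), the simd_usable flag and the call counters are all
hypothetical names, and the transform is a plain rotate rather than ChaCha.
The flag stands in for crypto_simd_usable(), and the kernel_fpu_begin()/
kernel_fpu_end() bracketing is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_WORDS 4

/* Hypothetical stand-in for crypto_simd_usable(). */
static bool simd_usable = true;

/* Call counters, present only so the chosen path is observable. */
static int generic_calls, simd_calls;

/* Portable fallback: a toy rotate-left-by-7, not real ChaCha. */
static void toy_block_generic(const uint32_t *state, uint32_t *out)
{
	generic_calls++;
	for (int i = 0; i < TOY_WORDS; i++)
		out[i] = (state[i] << 7) | (state[i] >> 25);
}

/* Pretend-accelerated variant: must match the fallback's output. */
static void toy_block_simd(const uint32_t *state, uint32_t *out)
{
	simd_calls++;
	for (int i = 0; i < TOY_WORDS; i++)
		out[i] = (state[i] << 7) | (state[i] >> 25);
}

/*
 * Single entry point that callers link against. The branch is taken at
 * run time, so both implementations stay referenced from this function.
 */
void toy_block(const uint32_t *state, uint32_t *out)
{
	assert(state && out);

	if (!simd_usable) {
		toy_block_generic(state, out);
	} else {
		/* In the kernel, kernel_fpu_begin()/kernel_fpu_end() go here. */
		toy_block_simd(state, out);
	}
}
```

Because toy_block() references both helpers, a compiler looking at this
function alone cannot discard the fallback; that is the trade-off raised
in point 1 of the quoted text.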


