Re: [RFC PATCH 11/20] crypto: BLAKE2s - x86_64 implementation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019-09-30 04:42:06 [+0200], Jason A. Donenfeld wrote:
> Hi Sebastian, Thomas,
Hi Jason,

> On Sun, Sep 29, 2019 at 7:39 PM Ard Biesheuvel
> <ard.biesheuvel@xxxxxxxxxx> wrote:
> > +       for (;;) {
> > +               const size_t blocks = min_t(size_t, nblocks,
> > +                                           PAGE_SIZE / BLAKE2S_BLOCK_SIZE);
> > +
> > +               kernel_fpu_begin();
> > +               if (IS_ENABLED(CONFIG_AS_AVX512) && blake2s_use_avx512)
> > +                       blake2s_compress_avx512(state, block, blocks, inc);
> > +               else
> > +                       blake2s_compress_avx(state, block, blocks, inc);
> > +               kernel_fpu_end();
> > +
> > +               nblocks -= blocks;
> > +               if (!nblocks)
> > +                       break;
> > +               block += blocks * BLAKE2S_BLOCK_SIZE;
> > +       }
> > +       return true;
> > +}
> 
> I'm wondering if on modern kernels this is actually fine and whether
> my simd_get/put/relax thing no longer has a good use case.
> Specifically, I recall last year there were a lot of patches and
> discussions about doing FPU register restoration lazily -- on context
> switch or the like. Did those land? Did the theory of action work out
> in the end?

That optimisation landed in v5.2. With that change (on x86)
kernel_fpu_end() does almost nothing (and so the following
kernel_fpu_begin()) and the FPU context will be restored once the task
returns to user land. (Note that this counts only for user-tasks because
we don't save the FPU state of a kernel thread.)
I haven't look at crypto code since that change was merged but looking
at the snippet from Ard is actually what I was aiming for.

> Regards,
> Jason

Sebastian



[Index of Archives]     [Kernel]     [Gnu Classpath]     [Gnu Crypto]     [DM Crypt]     [Netfilter]     [Bugtraq]

  Powered by Linux