Re: [PATCH v4 4/4] RISC-V: crypto: add accelerated GCM GHASH implementation

On Thu, May 11, 2023 at 3:30 AM Heiko Stübner <heiko@xxxxxxxxx> wrote:
>
> Hi Nathan,
>
> > On Tuesday, 11 April 2023 at 17:00:00 CEST, Nathan Huckleberry wrote:
> > On Wed, Mar 29, 2023 at 7:08 AM Heiko Stuebner <heiko@xxxxxxxxx> wrote:
> > > +struct riscv64_ghash_ctx {
> > > +       void (*ghash_func)(u64 Xi[2], const u128 Htable[16],
> > > +                          const u8 *inp, size_t len);
> > > +
> > > +       /* key used by vector asm */
> > > +       u128 htable[16];
> >
> > This field looks too big. The assembly only loads the first 128-bit
> > value from this table.
>
> OpenSSL defines the Htable field handed to the init and the other
> functions as this "u128 Htable[16]" [0]. As I really like the concept
> of keeping in sync with OpenSSL, I guess I'd rather not change that.
>
> [0] https://github.com/openssl/openssl/blob/master/crypto/modes/gcm128.c#L88
>
>
> > Is this copied from another implementation? There's an optimization
> > where you precompute the first N powers of H so that you can perform 1
> > finite field reduction for every N multiplications, but it doesn't
> > look like that's being used here.
>
> The whole crypto-specific code comes from OpenSSL itself, so for now I
> guess I'd like to try keeping things the same.
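
For reference, the aggregated-reduction scheme mentioned above precomputes
H^1..H^N so that N blocks cost N carry-less multiplications but only a
single reduction. A rough C sketch of the idea (not code from the patch or
from OpenSSL; clmul_128x128() and ghash_reduce() are hypothetical helpers,
and u128i is the compiler's unsigned __int128 rather than the kernel's u128
struct):

    typedef unsigned __int128 u128i;

    struct u256 { u128i hi, lo; };

    /* hypothetical: 128x128 carry-less multiply, unreduced 256-bit result */
    struct u256 clmul_128x128(u128i a, u128i b);
    /* hypothetical: reduce 256 bits modulo the GHASH polynomial */
    u128i ghash_reduce(struct u256 v);

    /* X4 = (X0^C1)*H^4 ^ C2*H^3 ^ C3*H^2 ^ C4*H, with a single reduction */
    static u128i ghash_4blocks(u128i Xi, const u128i Hpow[4],
                               const u128i blk[4])
    {
            struct u256 acc = { 0, 0 }, t;
            int i;

            for (i = 0; i < 4; i++) {
                    /* Hpow[0] = H^4, Hpow[1] = H^3, ..., Hpow[3] = H^1 */
                    t = clmul_128x128(blk[i] ^ (i == 0 ? Xi : (u128i)0),
                                      Hpow[i]);
                    acc.hi ^= t.hi;
                    acc.lo ^= t.lo;
            }

            return ghash_reduce(acc);   /* one reduction for four blocks */
    }
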
>
>
> > > +#define RISCV64_ZBC_SETKEY(VARIANT, GHASH)                             \
> > > +void gcm_init_rv64i_ ## VARIANT(u128 Htable[16], const u64 Xi[2]);     \
> > > +static int riscv64_zbc_ghash_setkey_ ## VARIANT(struct crypto_shash *tfm,      \
> > > +                                          const u8 *key,               \
> > > +                                          unsigned int keylen)         \
> > > +{                                                                      \
> > > +       struct riscv64_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(tfm)); \
> > > +       const u64 k[2] = { cpu_to_be64(((const u64 *)key)[0]),          \
> > > +                          cpu_to_be64(((const u64 *)key)[1]) };        \
> > > +                                                                       \
> > > +       if (keylen != GHASH_BLOCK_SIZE)                                 \
> > > +               return -EINVAL;                                         \
> > > +                                                                       \
> > > +       memcpy(&ctx->key, key, GHASH_BLOCK_SIZE);                       \
> > > +       gcm_init_rv64i_ ## VARIANT(ctx->htable, k);                     \
> > > +                                                                       \
> > > +       ctx->ghash_func = gcm_ghash_rv64i_ ## GHASH;                    \
> > > +                                                                       \
> > > +       return 0;                                                       \
> > > +}
> >
> > I'd prefer three identical functions over a macro here. Code searching
> > tools and compiler warnings are significantly worse with macros.
>
> done :-)
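
For reference, one unrolled variant would look roughly like the following
(a sketch derived directly from the macro above, using the "zbc" variant
name as in the update path below; not necessarily what was actually sent):

    void gcm_init_rv64i_zbc(u128 Htable[16], const u64 Xi[2]);

    static int riscv64_zbc_ghash_setkey_zbc(struct crypto_shash *tfm,
                                            const u8 *key,
                                            unsigned int keylen)
    {
            struct riscv64_ghash_ctx *ctx =
                    crypto_tfm_ctx(crypto_shash_tfm(tfm));
            const u64 k[2] = { cpu_to_be64(((const u64 *)key)[0]),
                               cpu_to_be64(((const u64 *)key)[1]) };

            if (keylen != GHASH_BLOCK_SIZE)
                    return -EINVAL;

            memcpy(&ctx->key, key, GHASH_BLOCK_SIZE);
            gcm_init_rv64i_zbc(ctx->htable, k);
            ctx->ghash_func = gcm_ghash_rv64i_zbc;

            return 0;
    }
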
>
>
> > > +
> > > +static int riscv64_zbc_ghash_update(struct shash_desc *desc,
> > > +                          const u8 *src, unsigned int srclen)
> > > +{
> > > +       unsigned int len;
> > > +       struct riscv64_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
> > > +       struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
> > > +
> > > +       if (dctx->bytes) {
> > > +               if (dctx->bytes + srclen < GHASH_DIGEST_SIZE) {
> > > +                       memcpy(dctx->buffer + dctx->bytes, src,
> > > +                               srclen);
> > > +                       dctx->bytes += srclen;
> > > +                       return 0;
> > > +               }
> > > +               memcpy(dctx->buffer + dctx->bytes, src,
> > > +                       GHASH_DIGEST_SIZE - dctx->bytes);
> > > +
> > > +               ctx->ghash_func(dctx->shash, ctx->htable,
> > > +                               dctx->buffer, GHASH_DIGEST_SIZE);
> > > +
> > > +               src += GHASH_DIGEST_SIZE - dctx->bytes;
> > > +               srclen -= GHASH_DIGEST_SIZE - dctx->bytes;
> > > +               dctx->bytes = 0;
> > > +       }
> > > +       len = srclen & ~(GHASH_DIGEST_SIZE - 1);
> > > +
> > > +       if (len) {
> > > +               gcm_ghash_rv64i_zbc(dctx->shash, ctx->htable,
> > > +                               src, len);
> > > +               src += len;
> > > +               srclen -= len;
> > > +       }
> > > +
> > > +       if (srclen) {
> > > +               memcpy(dctx->buffer, src, srclen);
> > > +               dctx->bytes = srclen;
> > > +       }
> > > +       return 0;
> > > +}
> > > +
> > > +static int riscv64_zbc_ghash_final(struct shash_desc *desc, u8 *out)
> > > +{
> > > +       int i;
> > > +       struct riscv64_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
> > > +       struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
> > > +
> > > +       if (dctx->bytes) {
> > > +               for (i = dctx->bytes; i < GHASH_DIGEST_SIZE; i++)
> > > +                       dctx->buffer[i] = 0;
> > > +               ctx->ghash_func(dctx->shash, ctx->htable,
> > > +                               dctx->buffer, GHASH_DIGEST_SIZE);
> >
> > Can we do this without an indirect call?
>
> Hmm, the indirect call is in both riscv64_zbc_ghash_update() and
> riscv64_zbc_ghash_final(). And I found a missing one at the bottom
> of riscv64_zbc_ghash_update(), where gcm_ghash_rv64i_zbc() is
> called directly right now.
>
> Getting rid of the indirect call would mean duplicating both of these
> functions for every variant. Especially given the slightly higher
> complexity of the update function, that does not seem like the best
> way to go.

Indirect calls are quite expensive. They are an issue for things like
disk/filesystem encryption because they add a lot of latency per
block.

I think this is a candidate for static calls. It looks like static
call support hasn't been accepted for RISC-V yet. Maybe just add a TODO
for now?

See:
https://lwn.net/Articles/771209/
https://lore.kernel.org/all/tencent_A8A256967B654625AEE1DB222514B0613B07@xxxxxx/
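
For reference, a rough sketch of what a static-call version could look like
(the DEFINE_STATIC_CALL()/static_call()/static_call_update() API from
<linux/static_call.h> is generic and falls back to an ordinary indirect call
on architectures without HAVE_STATIC_CALL; "riscv64_ghash" is a made-up
call-site name, the init-time selection is only a placeholder, and the
buffering logic from the patch is elided):

    #include <linux/static_call.h>

    /* from the patch's assembly */
    void gcm_ghash_rv64i_zbc(u64 Xi[2], const u128 Htable[16],
                             const u8 *inp, size_t len);

    DEFINE_STATIC_CALL(riscv64_ghash, gcm_ghash_rv64i_zbc);

    static int __init riscv64_ghash_mod_init(void)
    {
            /*
             * Hypothetical: if a faster variant is usable on this CPU,
             * patch the call site once here via
             * static_call_update(riscv64_ghash, &<other_variant>);
             */
            return 0;
    }

    static int riscv64_zbc_ghash_update(struct shash_desc *desc,
                                        const u8 *src, unsigned int srclen)
    {
            struct riscv64_ghash_ctx *ctx =
                    crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
            struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc);

            /* same buffering as in the patch, but the hot call is direct */
            static_call(riscv64_ghash)(dctx->shash, ctx->htable, src, srclen);
            return 0;
    }

Since the best implementation here depends on the CPU's ISA extensions
rather than on the key, selecting it once at module init would also let the
per-context ghash_func pointer go away entirely.
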

>
>
> Thanks for your pointers
> Heiko
>
>

Thanks,
Huck



