On Fri, Mar 01, 2024 at 10:26:29PM +0200, Jarkko Sakkinen wrote: > On Thu Feb 29, 2024 at 4:57 PM EET, Stefan Berger wrote: > > > > > > On 2/29/24 04:11, Lukas Wunner wrote: > > > On Fri, Feb 23, 2024 at 03:41:40PM -0500, Stefan Berger wrote: > > >> +static inline void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes, > > >> + u64 *out, unsigned int ndigits) > > >> +{ > > >> + unsigned int sz = ndigits << ECC_DIGITS_TO_BYTES_SHIFT; > > >> + u8 tmp[ECC_MAX_DIGITS << ECC_DIGITS_TO_BYTES_SHIFT]; > > >> + unsigned int o = sz - nbytes; > > >> + > > >> + memset(tmp, 0, o); > > >> + memcpy(&tmp[o], in, nbytes); > > >> + ecc_swap_digits(tmp, out, ndigits); > > >> +} > > > > > > Copying the whole key into tmp seems inefficient. You only need > > > special handling for the first few bytes of "in" (6 bytes in the > > > P521 case) and could use ecc_swap_digits() to convert the rest > > > of "in" directly to "out" without using tmp. > > > > > > So it would be sufficient to allocate the first digit on the stack, > > > memset + memcpy, then convert that to native byte order into "in[0]" > > > and use ecc_swap_digits() for the rest. > > > > > > And the special handling would be conditional on "!o", so is skipped > > > for existing curves. > > > > Thanks. It looks like this now: > > > > static inline void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes, > > u64 *out, unsigned int ndigits) > > { > > unsigned int o = nbytes & 7; > > u64 msd = 0; > > size_t i; > > > > if (o == 0) { > > ecc_swap_digits(in, out, ndigits); > > } else { > > for (i = 0; i < o; i++) > > msd = (msd << 8) | in[i]; > > out[ndigits - 1] = msd; > > ecc_swap_digits(&in[o], out, ndigits - 1); > > This would be more stream-lined IMHO: > > unsigned int o = nbytes & 7; > unsigned int n = ndigits; > u64 msd = 0; > size_t i; > > if (o != 0) { > for (i = 0; i < o; i++) > msd = (msd << 8) | in[i]; > > out[--n] = msd; > } > > ecc_swap_digits(in, out, n); Maybe eliminate the for-loop as well? unsigned int o = nbytes & 7; u64 msd = 0; if (o != 0) { /* if key length is not a multiple of 64 bits (NIST P521) */ memcpy((u8 *)&msd + sizeof(msd) - o, in, o); out[--ndigits] = be64_to_cpu(msd); in += o; } ecc_swap_digits(in, out, ndigits);