On Sat Mar 2, 2024 at 4:00 PM EET, Lukas Wunner wrote:
> On Fri, Mar 01, 2024 at 10:26:29PM +0200, Jarkko Sakkinen wrote:
> > On Thu Feb 29, 2024 at 4:57 PM EET, Stefan Berger wrote:
> > >
> > >
> > > On 2/29/24 04:11, Lukas Wunner wrote:
> > > > On Fri, Feb 23, 2024 at 03:41:40PM -0500, Stefan Berger wrote:
> > > >> +static inline void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes,
> > > >> +                                         u64 *out, unsigned int ndigits)
> > > >> +{
> > > >> +        unsigned int sz = ndigits << ECC_DIGITS_TO_BYTES_SHIFT;
> > > >> +        u8 tmp[ECC_MAX_DIGITS << ECC_DIGITS_TO_BYTES_SHIFT];
> > > >> +        unsigned int o = sz - nbytes;
> > > >> +
> > > >> +        memset(tmp, 0, o);
> > > >> +        memcpy(&tmp[o], in, nbytes);
> > > >> +        ecc_swap_digits(tmp, out, ndigits);
> > > >> +}
> > > >
> > > > Copying the whole key into tmp seems inefficient.  You only need
> > > > special handling for the first few bytes of "in" (6 bytes in the
> > > > P521 case) and could use ecc_swap_digits() to convert the rest
> > > > of "in" directly to "out" without using tmp.
> > > >
> > > > So it would be sufficient to allocate the first digit on the stack,
> > > > memset + memcpy, then convert that to native byte order into "in[0]"
> > > > and use ecc_swap_digits() for the rest.
> > > >
> > > > And the special handling would be conditional on "!o", so is skipped
> > > > for existing curves.
> > >
> > > Thanks. It looks like this now:
> > >
> > > static inline void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes,
> > >                                          u64 *out, unsigned int ndigits)
> > > {
> > >         unsigned int o = nbytes & 7;
> > >         u64 msd = 0;
> > >         size_t i;
> > >
> > >         if (o == 0) {
> > >                 ecc_swap_digits(in, out, ndigits);
> > >         } else {
> > >                 for (i = 0; i < o; i++)
> > >                         msd = (msd << 8) | in[i];
> > >                 out[ndigits - 1] = msd;
> > >                 ecc_swap_digits(&in[o], out, ndigits - 1);
> >
> > This would be more stream-lined IMHO:
> >
> >         unsigned int o = nbytes & 7;
> >         unsigned int n = ndigits;
> >         u64 msd = 0;
> >         size_t i;
> >
> >         if (o != 0) {
> >                 for (i = 0; i < o; i++)
> >                         msd = (msd << 8) | in[i];
> >
> >                 out[--n] = msd;
> >         }
> >
> >         ecc_swap_digits(in, out, n);
>
> Maybe eliminate the for-loop as well?
>
>         unsigned int o = nbytes & 7;
>         u64 msd = 0;
>
>         if (o != 0) {
>                 /* if key length is not a multiple of 64 bits (NIST P521) */
>                 memcpy((u8 *)&msd + sizeof(msd) - o, in, o);
>                 out[--ndigits] = be64_to_cpu(msd);
>                 in += o;
>         }
>
>         ecc_swap_digits(in, out, ndigits);

+1

BR, Jarkko
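
For anyone who wants to try the memcpy-based variant outside the kernel, here is a rough userspace sketch. It is not the patch itself: load_be64(), swap_digits() and digits_from_bytes() are hypothetical stand-ins for be64_to_cpu() on an 8-byte buffer, ecc_swap_digits() and ecc_digits_from_bytes(), and the assert() is my assumption about how callers pair nbytes with ndigits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for be64_to_cpu() on a byte buffer: decode
 * 8 big-endian bytes into a native 64-bit integer. */
static uint64_t load_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Hypothetical stand-in for ecc_swap_digits(): convert ndigits * 8
 * big-endian bytes into native digits, least significant digit first. */
static void swap_digits(const uint8_t *in, uint64_t *out, unsigned int ndigits)
{
	for (unsigned int i = 0; i < ndigits; i++)
		out[i] = load_be64(in + 8 * (size_t)(ndigits - 1 - i));
}

/* Sketch of the memcpy-based variant discussed in the thread. */
static void digits_from_bytes(const uint8_t *in, unsigned int nbytes,
			      uint64_t *out, unsigned int ndigits)
{
	unsigned int o = nbytes & 7;

	/* Assumption: ndigits is exactly the digit count covering nbytes. */
	assert(ndigits == (nbytes + 7) / 8);

	if (o != 0) {
		/* Length is not a multiple of 64 bits: right-align the o
		 * leading bytes in a zeroed 8-byte buffer and decode it as
		 * the most significant digit. */
		uint8_t msd[8] = { 0 };

		memcpy(msd + sizeof(msd) - o, in, o);
		out[--ndigits] = load_be64(msd);
		in += o;
	}

	/* The remaining bytes are a whole number of 64-bit digits. */
	swap_digits(in, out, ndigits);
}
```

With a 66-byte P521 coordinate, o is 2 and ndigits is 9, so only the top digit takes the special path and the other eight digits go straight through swap_digits().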