On Thu Feb 29, 2024 at 4:57 PM EET, Stefan Berger wrote: > > > On 2/29/24 04:11, Lukas Wunner wrote: > > On Fri, Feb 23, 2024 at 03:41:40PM -0500, Stefan Berger wrote: > >> +static inline void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes, > >> + u64 *out, unsigned int ndigits) > >> +{ > >> + unsigned int sz = ndigits << ECC_DIGITS_TO_BYTES_SHIFT; > >> + u8 tmp[ECC_MAX_DIGITS << ECC_DIGITS_TO_BYTES_SHIFT]; > >> + unsigned int o = sz - nbytes; > >> + > >> + memset(tmp, 0, o); > >> + memcpy(&tmp[o], in, nbytes); > >> + ecc_swap_digits(tmp, out, ndigits); > >> +} > > > > Copying the whole key into tmp seems inefficient. You only need > > special handling for the first few bytes of "in" (6 bytes in the > > P521 case) and could use ecc_swap_digits() to convert the rest > > of "in" directly to "out" without using tmp. > > > > So it would be sufficient to allocate the first digit on the stack, > > memset + memcpy, then convert that to native byte order into "in[0]" > > and use ecc_swap_digits() for the rest. > > > > And the special handling would be conditional on "!o", so is skipped > > for existing curves. > > Thanks. It looks like this now: > > static inline void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes, > u64 *out, unsigned int ndigits) > { > unsigned int o = nbytes & 7; > u64 msd = 0; > size_t i; > > if (o == 0) { > ecc_swap_digits(in, out, ndigits); > } else { > for (i = 0; i < o; i++) > msd = (msd << 8) | in[i]; > out[ndigits - 1] = msd; > ecc_swap_digits(&in[o], out, ndigits - 1); This would be more stream-lined IMHO: unsigned int o = nbytes & 7; unsigned int n = ndigits; u64 msd = 0; size_t i; if (o != 0) { for (i = 0; i < o; i++) msd = (msd << 8) | in[i]; out[--n] = msd; } ecc_swap_digits(in, out, n); BR, Jarkko