Re: Flaw in "random32: update the net random state on interrupt and activity"

Willy Tarreau <w@xxxxxx> · Sat, 8 Aug 2020 00:19:13 +0200

On Fri, Aug 07, 2020 at 12:59:48PM -0700, Marc Plumb wrote:
> 
> On 2020-08-07 10:43 a.m., Willy Tarreau wrote:
> > 
> > > Which means that it's 2^32 effort to brute force this (which Amit called "no
> > > biggie for modern machines"). If the noise is the raw sample data with only
> > > a few bits of entropy, then it's even easier to brute force.
> > Don't you forget to multiply by another 2^32 for X being folded onto itself ?
> > Because you have 2^32 possible values of X which will give you a single 32-bit
> > output value for a given noise value.
> 
> If I can figure the state out once,

Yes but how do you take that as granted ? This state doesn't appear
without its noise counterpart, so taking as a prerequisite that you may
guess one separately obviously indicates that you then just have to
deduce the other, but the point of mixing precisely is that we do not
expose individual parts.

This way of thinking is often what results in extreme solutions to be
designed, which are far away from the reality of the field of application,
and result in unacceptable costs that make people turn to other solutions.
Do you think it makes me happy to see people waste their time reimplementing
alternate userland TCP stacks that are supposedly "way faster" by getting
rid of all the useless (to them) stuff that was forced on them at the cost
of performance ? And it makes me even less happy when they ask me why I'm
not spending more time trying to adopt them. The reality is that this time
could be better spent optimizing our stack to be sure that costs are added
where they are relevant, and not just to make sure that when we align 7
conditions starting with "imagine that I could guess that", the 8th could
be guessed as well, except that none of these can really be guessed outside
of a lab :-/

> then the only new input is the noise, so
> that's the only part I have to brute force. Throwing the noise in makes it
> more difficult to get that state once, but once I have it then this type of
> reseeding doesn't help.

> I think it might be possible to do a decent CPRNG (that's at
> least had some cryptanalys of it) with ~20 instructions per word, but if
> that's not fast enough then I'll think about other options.

I think that around 20 instructions for a hash would definitely be nice
(but please be aware that we're speaking about RISC-like instructions,
not SIMD instructions). And also please be careful not to count only
with amortized performance that's only good to show nice openssl
benchmarks, because if that's 1280 instructions for 256 bits that
result in 20 instructions per 32-bit word, it's not the same anymore
at all!

Regards,
Willy