Re: Randomness sources for the IETF 2015-2016 Nomcom Selection

John C Klensin <john-ietf@xxxxxxx> · Mon, 22 Jun 2015 19:38:06 -0400

--On Monday, June 22, 2015 14:52 -0700 Ted Faber <faber@xxxxxxx>
wrote:

> Look, I think this issue is silly.  I'm just going to leave
> this link here for peoples' enjoyment, though.

+1

FWIW, there was a time in my life when I had the dubious
privilege of having to pay a lot of attention to methods of
generating random numbers and how random the results were.  It
didn't turn out to be a lot of fun, but the nearly-intuitive
parts are:

* Using values from a selection of independent random and
pseudo-random sources is a lot safer than relying on one source.
That is, of course, what the IETF process tries to do.  If
sufficient analysis is done to know how a particular number
varies, dropping a number of leading and trailing digits based
on that knowledge can improve randomness.  For example, the
leading digits of a lot of public statistics (including, e.g.,
debt or stock market values) don't fluctuate very much on a
day-over-day basis, so one is better off not including them in the
calculation.  However, doing the analysis is a lot of trouble
and the effects after the hashing described in RFC 3797 may be
trivial -- we should try to remember that there is really no
such thing as perfect numeric randomness and that the goal is,
necessarily, "good enough".
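As a rough illustration of the digit-dropping idea above, here is a minimal Python sketch.  The figures are made up, the function names and the 0.5 threshold are my own assumptions, and the SHA-256 step is only a stand-in: RFC 3797 defines its own key-string format and hashing procedure, which this does not reproduce.

```python
import hashlib

def change_rate_by_position(history):
    """For equal-width digit strings, return, per digit position, the
    fraction of day-over-day transitions in which that digit changed."""
    width = len(history[0])
    if not all(len(v) == width and v.isdigit() for v in history):
        raise ValueError("history must hold equal-width digit strings")
    transitions = list(zip(history, history[1:]))
    return [
        sum(a[i] != b[i] for a, b in transitions) / len(transitions)
        for i in range(width)
    ]

def keep_varying_digits(value, rates, threshold=0.5):
    """Keep only digits whose position changed at least `threshold` of the time."""
    return "".join(d for d, r in zip(value, rates) if r >= threshold)

def combine(sources):
    """Hash several source strings together (illustrative stand-in only)."""
    key = "/".join(sources) + "/"
    return hashlib.sha256(key.encode("ascii")).hexdigest()

# Hypothetical daily figures for one statistic (not real data).
history = ["18152034", "18152977", "18153410", "18154862", "18155093"]
rates = change_rate_by_position(history)
trimmed = keep_varying_digits(history[-1], rates)  # leading "1815" is dropped
digest = combine([trimmed, "0713", "42"])          # other sources are made up too
```

With so few observations the per-position rates are crude at best; a real analysis would need a much longer history, which is the "lot of trouble" part.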

* You don't want to use lotteries unless (i) you have sound,
long-term, statistical reasons to know that they are really
random (historically, even with no deliberate efforts at
manipulation, some of them have rather poor track records) and
(ii) if you use multiple lotteries, you are sure that their
results are uncorrelated.  In practice, that is likely to mean
that you need to be sure that they don't use the same random
number generation process: for a seed-based process, knowing that
they are using different seeds may not be enough.  (Although
with the same qualification about "good enough" as above.)
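One crude first check on the "uncorrelated" requirement above is a Pearson correlation over paired historical draws.  The sketch below is my own illustration with made-up numbers; a coefficient near zero does not establish independence (it only detects linear association), so treat it as a screen, not a proof.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sample has no defined correlation")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired draws from two lotteries on the same dates (not real data).
lottery_a = [17, 4, 38, 22, 9, 31, 45, 12]
lottery_b = [19, 6, 40, 24, 11, 33, 47, 14]  # lottery_a shifted by 2

r = pearson(lottery_a, lottery_b)  # a constant shift is perfectly correlated
```

Two lotteries whose draws track each other like this add little as a second source, however random each looks on its own.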

* Again, the strength of the IETF value selection process is
that it draws from several sources of numbers that we have a
reasonable expectation of being independent.   That can be much
more important than how random a particular source actually is.
Put differently, if there were evidence that the national debt
figure for Absurdistan was being manipulated, it would make
little or no difference as long as it was large/long enough and
we chose digits that varied a lot (e.g., were not the result of
predictable rounding behavior and for which there was no
systematic correlation among digits in a group).  
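To make the independence point above concrete, here is a small Python sketch in which one source is pinned to a fixed value (as if fully manipulated) while a second, independent source varies.  The function name, the pool size of 10, and the use of SHA-256 are assumptions for illustration; this is not the RFC 3797 procedure.

```python
import hashlib

def pick(sources, pool_size):
    """Map a list of source strings to an index in range(pool_size)."""
    key = "/".join(sources) + "/"
    digest = hashlib.sha256(key.encode("ascii")).digest()
    return int.from_bytes(digest, "big") % pool_size

# Source 1 is fixed by a hypothetical manipulator; source 2 is outside
# their control and takes 1000 different values.
manipulated = "000000"
picks = [pick([manipulated, f"{n:06d}"], 10) for n in range(1000)]

# How often each of the 10 pool slots was selected.
counts = [picks.count(slot) for slot in range(10)]
```

Even with one input pinned, the selections spread across the whole pool as the other input varies, so controlling a single source does not let anyone steer the outcome as long as at least one other source stays unpredictable to them.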

All of that said, making a rule that those making up the
selection rules were not permitted to use debt (or any other
national statistic) from the same country two years in a row (or
even two years out of three or five) would strike me as a
political statement.  Given the way the whole procedure works,
repeated use of one country's statistic(s) to get some digits
seems no more political than a choice of lotteries or horse
races.

We now return you to the next regularly-scheduled demonstration
of the ability of the IETF list to generate very long threads
about topics that make very little difference.

  best,
   john
   (wearing my long-disused hat as a card-carrying statistician)