Thom Brown wrote:
I'm not sure why you're saying that there's a 50%
chance of duplication after 7240 values though. With 33 million
combinations, I would have thought that duplications would become equally
likely at the 16,777,216 mark.
Basic probability.
<http://en.wikipedia.org/wiki/Birthday_attack>
<http://en.wikipedia.org/wiki/Birthday_problem>
Sam Mason wrote:
No, that's why I pointed out birthday attacks---collisions happen much
more often than you'd expect. Get 23 people in a room and you have a
50% chance of two people having the same birthday--not 150 people. This
is why it's called the birthday attack and it's one of the basic tests
for hash functions--any bias in their output will shrink this number
even further.
Taking the birthday example, the chance that no two people in a group of size
n < 365 have the same birthday (irrespective of year) is
(365-0)/365 x (365-1)/365 x (365-2)/365 x (365-3)/365 ... x (365 - (n-1))/365
As each term in the expression is less than one, their multiplication together
rapidly approaches zero. That means the probability of two or more people
having the same birthday rapidly approaches one. The 50% point is a bit under
n=23.
Substitute 33 million for 365 for the OP's problem.
--
Lew
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general