On Sun, Aug 16, 2009 at 12:57:34PM +0100, Thom Brown wrote: > > SELECT array_to_string(array(( > > SELECT SUBSTRING('abcdefghjklmnpqrstuvwxyz23456789' > > FROM mod((random()*32)::int, 32)+1 FOR 1) > > FROM generate_series(1,5))),''); I've just had a look and PG does actually seem to be returning values as I'd expect, i.e. 0 <= n < 1. So the following would work: floor(random()*32)::int+1 instead of the mod hack. Distribution looks reasonably flat (this is good): char %occurances 1 3.1222 2 3.1329 3 3.1269 4 3.1236 5 3.1233 6 3.1310 7 3.1226 8 3.1298 9 3.1229 10 3.1294 11 3.1192 12 3.1249 13 3.1267 14 3.1236 15 3.1190 16 3.1279 17 3.1232 18 3.1218 19 3.1314 20 3.1091 21 3.1337 22 3.1239 23 3.1184 24 3.1347 25 3.1205 26 3.1160 27 3.1219 28 3.1344 29 3.1118 30 3.1256 31 3.1408 32 3.1255 > I'm not sure why you're saying that there's a 50% > chance of duplication after 7240 values though. With 33 million > combinations, I would have thought that duplications would become equally > likely at the 16,777,216 mark. No, that's why I pointed out birthday attacks---collisions happen much more often than you'd expect. Get 23 people in a room and you have a 50% chance of two people having the same birthday--not 150 people. This is why it's called the birthday attack and it's one of the basic tests for hash functions--any bias in their output will shrink this number even further. -- Sam http://samason.me.uk/ -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general