Thanks to Gavin and Martijn for their suggestions.
They're both simple good-ol' LCGs, and both avoid the need to
check for collisions.
I ultimately went with a multiplicative LCG, like
Martijn's, mostly because I understand better how it avoids
collisions, so it was easier for me to tweak it in various
ways.
In particular, I changed the prime number from 899981 to
the very lucky prime 900001. This happens to work
*perfectly*, because the range of such a generator is p-1, not
p. (BTW, Martijn's choice of the "random" 2345 for the
multiplier was a somewhat lucky one, since such generators are
not full for arbitrary multipliers; for example, the one with
modulus 899981 is not full for a multiplier of 3456, say.)
I also followed Martijn's pointer regarding the 3-argument
form of python's pow function, and implemented a 3-argument
pow for PL/PgSQL. I include all the code below, including a
snippet borrowed from Gavin's post, and modified here and
there. (I'm not very experienced with PL/PgSQL, so please
feel free to point out ways in which my PL/PgSQL code can be
improved.)
First the functions:
CREATE OR REPLACE
FUNCTION pow_mod(bigx bigint, n bigint, m bigint) returns
bigint AS $$
DECLARE
x bigint;
xx bigint;
BEGIN
IF n = 0 THEN
RETURN 1; END IF;
x := bigx % m;
xx := (x * x) %
m;
IF n % 2 = 0
THEN
RETURN
pow_mod(xx, n/2, m);
ELSE
RETURN (x *
pow_mod(xx, (n-1)/2, m)) % m;
END IF;
END;
$$ LANGUAGE
plpgsql strict immutable;
-- "mcg" =
"multiplicative congruential generator"
CREATE OR REPLACE
FUNCTION mcg_900001(i bigint) returns int AS $$
BEGIN
-- CHECK (0
< i AND i < 900001)
RETURN 99999 +
pow_mod(<INSERT YOUR MULTIPLIER HERE>, i, 900001);
END;
$$ LANGUAGE
plpgsql strict immutable;
And here's a small demo:
DROP TABLE IF
EXISTS rtab;
DROP SEQUENCE IF
EXISTS rseq;
CREATE SEQUENCE
rseq;
CREATE TABLE rtab
(
id int
PRIMARY KEY DEFAULT mcg_900001(nextval('rseq')),
payload int
NOT NULL
);
\timing on \\
INSERT INTO rtab (payload) VALUES (generate_series(1,
900000)); \timing off
-- Timing is
on.
-- INSERT 0
900000
-- Time:
201450.781 ms
-- Timing is
off.
SELECT * FROM
rtab WHERE 449990 < payload AND payload < 450011;
-- id |
payload
--
--------+---------
-- 539815 |
449991
-- 901731 |
449992
-- 878336 |
449993
-- 564275 |
449994
-- 863664 |
449995
-- 720159 |
449996
-- 987833 |
449997
-- 999471 |
449998
-- 999977 |
449999
-- 999999 |
450000
-- 921739 |
450001
-- 722684 |
450002
-- 596638 |
450003
-- 121592 |
450004
-- 687895 |
450005
-- 477734 |
450006
-- 585988 |
450007
-- 942869 |
450008
-- 175776 |
450009
-- 377207 |
450010
-- (20 rows)
kj