On Fri, Aug 19, 2016 at 3:20 PM, Daniel Verite <daniel@xxxxxxxxxxxxxxxx> wrote: > There's a simple technique that works on top of a Feistel network, > called the cycle-walking cipher. Described for instance at: > http://web.cs.ucdavis.edu/~rogaway/papers/subset.pdf > I'm using the opportunity to add a wiki page: > https://wiki.postgresql.org/wiki/Pseudo_encrypt_constrained_to_an_arbitrary_range > with sample plgsql code for the [0..10,000,000] range that might be useful > for other cases. Nice reference, nice WikiPage, nice job. Bookmarking every thing. > But for the btree fragmentation and final size issue, TBH I don't expect > that constraining the values within that smaller range will make any > difference > in the tests, because it's the dispersion that matters, not the values > themselves. Neither do I, that is why I stated probabley good enough for tests, but.... > I mean that, whether the values are well dispersed in the [0..1e7] range or > equally well dispersed in the [0..2**32] range, the probability of a newly > inserted value to compare greater or lower to any previous values of the list > should be the same, so shouldn't the page splits be the same, statistically > speaking? I know some btrees do prefix-coding of the keys, and some do it even with long integers, and some others do delta-coding for integer keys. I seem to recall postgres does not, that's whay I do not think it will make a difference. But anyway, to compare two things like that, as the original poster was doing, I normally prefer to test just one thing at a time, that's why I would normally try to do it by writing a sorted file, shuffling it with sort -R, and copying it, server side if posible, to eliminate so both Francisco Olarte. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general