In article <ffnid8$1q2t$1@xxxxxxxxxxxx>, cluster <skrald@xxxxxxxxxx> wrote: % > How important is true randomness? % % The goal is an even distribution but currently I have not seen any way % to produce any kind of random sampling efficiently. Notice the word How about generating the ctid randomly? You can get the number of pages from pg_class and estimate the number of rows either using the number of tuples in pg_class or just based on what you know about the data. Then just generate two series of random numbers, one from 0 to the number of pages and the other from 1 to the number of rows per page, and keep picking rows until you have enough numbers. Assuming there aren't too many dead tuples and your estimates are good, this should retrieve n rows with roughly n look-ups. If your estimates are low, there will be tuples which can never be selected, and so far as I know, there's no way to construct a random ctid in a stock postgres database, but apart from that it seems like a good plan. If efficiency is important, you could create a C function which returns a series of random tids and join on that. -- Patrick TJ McPhee North York Canada ptjm@xxxxxxxxxxxx ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster