On Sun, Aug 11, 2013 at 9:59 PM, Victor Hooi <victorhooi@xxxxxxxxx> wrote:
> Hmm, aha, so the ORDER BY RANDOM behaviour hasn't changed - just to confirm
> - this means that Postgres will duplicate the table, add a new column,
> generate random numbers for every record, then sort by that new column,
> right?

It doesn't duplicate the table. It seq scans it and uses a top-N sort
if we use LIMIT, and an in-memory or on-disk sort, depending on the
data size, if we don't.

> I've just read the above anecdotally on the internet, but I'm curious if the
> actual implementation is documented somewhere officially apart from the
> source? Running the query through EXPLAIN didn't seem to tell me much
> additional information.

I can't say about official docs, but you will find a good explanation
of sorting here:

http://www.depesz.com/2013/05/09/explaining-the-unexplainable-part-3/

> @Sergey - Thanks for the tip about using WITH RECURSIVE. I'm actually doing
> something similar in my application code in Django - basically take the max
> id, then generate a random integer between 0 and max id. However, it is
> dependent on how evenly distributed the record IDs are - in our case, if we
> delete a large number of records, it might affect things.

You can try looking at pg_stats.histogram_bounds to work around the
issue, however that is just my assumption, I have never tried it.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@xxxxxxxxx
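
P.S. Two quick sketches, in case they are useful.

You can see the sorting behaviour yourself with EXPLAIN ANALYZE (plain
EXPLAIN doesn't show the sort method). Assuming a table "items" with an
integer primary key "id" (the names here are made up for the example):

    -- with LIMIT Postgres keeps only the N best rows in memory
    EXPLAIN ANALYZE
    SELECT * FROM items ORDER BY random() LIMIT 10;
    -- look for "Sort Method: top-N heapsort" in the output

    -- without LIMIT the whole result set is sorted, in memory or on disk
    EXPLAIN ANALYZE
    SELECT * FROM items ORDER BY random();
    -- "Sort Method: quicksort" or "external merge", depending on work_mem

And the pg_stats idea, again only a guess I haven't tried in production:
ANALYZE stores a histogram of the column values, so you can check how
evenly the ids are spread before trusting the max-id trick:

    -- histogram of the id column collected by ANALYZE
    SELECT histogram_bounds
    FROM pg_stats
    WHERE schemaname = 'public'
      AND tablename = 'items'
      AND attname = 'id';

If the bounds are roughly evenly spaced, a random integer between the
min and max id will usually hit an existing row; if you see large gaps,
you will have to retry on misses or compensate some other way.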