On 18 October 2016 at 19:34, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> Andy Colson <andy@xxxxxxxxxxxxxxx> writes:
>> On 10/18/2016 11:44 AM, Francisco Olarte wrote:
>>> This should be faster, but to me it seems it does a different thing.
>
>> Ah, yes, you're right, there is a bit of a difference there.
>
> If you don't want to have an implicit bias towards earlier blocks,
> I don't think that either standard tablesample method is really what
> you want.
>
> The contrib/tsm_system_rows tablesample method is a lot closer, in
> that it will start at a randomly chosen block, but if you just do
> "tablesample system_rows(1)" then you will always get the first row
> in whichever block it lands on, so it's still not exactly unbiased.

Is there a reason why we can't fix the behaviours of the three methods
mentioned above by making them all start at a random block and a
random item between min and max?

It wasn't ever intended to be biased and bernoulli in particular ought
to have a strict no bias.

Happy to patch if we agree.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
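
To make the bias under discussion concrete, here is a minimal simulation sketch in Python. It is not PostgreSQL code: the toy table layout (4 blocks of 5 rows) and the function names `sample_first_row` and `sample_random_item` are assumptions made up for illustration. It compares "random block, always the first row" (the behaviour described above for "tablesample system_rows(1)") with "random block, random item" (the proposed change).

```python
import random
from collections import Counter


def sample_first_row(blocks, rng):
    """Pick a random block, then always return its first row."""
    block = rng.choice(blocks)
    return block[0]


def sample_random_item(blocks, rng):
    """Pick a random block, then a random item within that block."""
    block = rng.choice(blocks)
    return rng.choice(block)


def simulate(sampler, blocks, trials, seed=0):
    """Count how often each row is returned over many one-row samples."""
    rng = random.Random(seed)
    return Counter(sampler(blocks, rng) for _ in range(trials))


# Hypothetical toy table: 4 blocks of 5 rows, each row labelled (block, item).
blocks = [[(b, i) for i in range(5)] for b in range(4)]

first = simulate(sample_first_row, blocks, 10_000)
rand = simulate(sample_random_item, blocks, 10_000)

# With the first-row sampler, only rows whose item index is 0 ever appear.
print(sorted(first))
# With the random-item sampler, rows at every item position appear.
print(len(rand), "distinct rows seen")
```

In this toy setup every block holds the same number of rows, so "random block, random item" is uniform over rows. If blocks were filled unevenly, rows in sparser blocks would still be somewhat more likely to be picked, so the sketch only illustrates the first-row bias and does not show that the proposal removes all bias.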