Re: tablesample performance

On 18 October 2016 at 19:34, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> Andy Colson <andy@xxxxxxxxxxxxxxx> writes:
>> On 10/18/2016 11:44 AM, Francisco Olarte wrote:
>>> This should be faster, but to me it seems it does a different thing.
>
>> Ah, yes, you're right, there is a bit of a difference there.
>
> If you don't want to have an implicit bias towards earlier blocks,
> I don't think that either standard tablesample method is really what
> you want.
>
> The contrib/tsm_system_rows tablesample method is a lot closer, in
> that it will start at a randomly chosen block, but if you just do
> "tablesample system_rows(1)" then you will always get the first row
> in whichever block it lands on, so it's still not exactly unbiased.

Is there a reason why we can't fix the behaviours of the three methods
mentioned above by making them all start at a random block and a
random item between min and max?

It was never intended to be biased, and bernoulli in particular ought
to be strictly unbiased.
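
To make the difference concrete, here is a toy sketch in Python. It is
only a model, not PostgreSQL's tablesample code: the table is a list of
blocks with equal row counts, and the function names sample_first_row
and sample_random_item are made up for illustration. The first mimics
landing on a random block and always returning its first row; the
second also picks a random item within the block.

```python
import random
from collections import Counter

def sample_first_row(blocks, rng):
    """Pick a random block, then always return its first row."""
    block = rng.choice(blocks)
    return block[0]

def sample_random_item(blocks, rng):
    """Pick a random block, then a random row within that block."""
    block = rng.choice(blocks)
    return rng.choice(block)

# Toy table: 4 blocks of 5 rows each; a row is identified as (block_no, item_no).
blocks = [[(b, i) for i in range(5)] for b in range(4)]
rng = random.Random(42)

first_counts = Counter(sample_first_row(blocks, rng) for _ in range(20000))
random_counts = Counter(sample_random_item(blocks, rng) for _ in range(20000))

# How many distinct rows each strategy can ever return.
print(len(first_counts), "distinct rows via the first-row strategy")
print(len(random_counts), "distinct rows via the random-item strategy")
```

In this model the first strategy can only ever return the 4 rows at
item 0, while the second can reach all 20. The model assumes every
block holds the same number of rows; with uneven blocks, rows in
sparser blocks would still be picked more often than others.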

Happy to patch if we agree.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
