Search Postgresql Archives

Re: intermittant performance problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I think (a part of) Your problem is that order by random() is O(N
logN) complexity, while You are after O(N) .

The solution (in pseudocode)

random_sample(resultset,K):
  result := first K rows from resultset
  resultset.scrollto(K+1)
  p = K+1
  while(resultset.hasMoreRows())
        row = resultset.current()
        resultset.advance()
        if(random() < K/p)
              replace a random element in result with row
        p = p+1
  return result


the invariant being that at each step result contains a random sample
of K elements.

It should be fairly easy to implement in plpgsql.

Greetings
Marcin

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux