Search Postgresql Archives

Re: Obtaining random rows from a result set

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Aug 31, 2007, at 8:34 AM, Kaloyan Iliev wrote:

Alban Hertroys wrote:

Hello,

I've recently been busy improving a query that yields a fixed number of random records matching certain conditions. I have tried all the usual approaches, and although they do work, they're all limited in some way and don't translate really well to what you "want". They're kludges, IMHO.

The methods I've tried are explained quite well on
http://people.planetpostgresql.org/greg/index.php?/archives/40- Getting-random-rows-from-a-database-table.html

All these methods involve calculating a random number for every record
in the result set at some point in time, which is really not what I'm
trying to model. I think the database should provide some means to get
those records, so...

Dear Santa,

I'd like my database to have functionality analogue to how LIMIT works,
but for other - non-sequential - algorithms.

I was thinking along the lines of:

	SELECT *
	  FROM table
	 WHERE condition = true
	 RANDOM 5;

Which would (up to) return 5 random rows from the result set, just as
LIMIT 5 returns (up to) the first 5 records in the result set.


Or maybe even with a custom function, so that you could get non- linear
distributions:

	SELECT *
	  FROM table
	 WHERE condition = true
	 LIMIT 5 USING my_func();

Where my_func() could be a user definable function accepting a number that should be (an estimate of?) the number of results being returned so
that it can provide pointers to which rows in the resultset will be
returned from the query.

Examples:
* random(maxrows) would return random rows from the resultset.
* median() would return the rows in the middle of the result set (this
would require ordering to be meaningful).

What do people think, is this feasable? Desirable? Necessary?

If I'd have time I'd volunteer for at least looking into this, but I'm
working on three projects simultaneously already. Alas...

Regards,
Alban Hertroys.


Hi,
Why not generate a random number in your application and then:

SELECT *
FROM table_x
WHERE condition = true
OFFSET generated_random_number
LIMIT xx

Kaloyan Iliev


That won't work without some kind of a priori knowledge of how many rows the query would return without the offset and limit.

Erik Jones

Software Developer | Emma®
erik@xxxxxxxxxx
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org/


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux