Search Postgresql Archives

Re: Obtaining random rows from a result set

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



To follow up on my own post, I came up with a workable solution based on
scrolling cursors. The SP approach didn't work out for me, I didn't
manage to declare a cursor in PL/pgSQL that could be positioned
absolutely (maybe that's due to us still using PG 8.1.something?).

A solution to that would be appreciated.

Anyway, I solved the problem in our application (PHP). I even got a
workable solution to prevent returning the same record more than once.
Here goes:

function randomSet($query, $limit, $uniqueColumn) {

	// queries; depends on your DB connector
	DECLARE _cur SCROLL CURSOR WITHOUT HOLD FOR $query;
	MOVE FORWARD ALL IN _cur;

	//GET DIAGNOSTICS _count := ROW_COUNT;
	$count = pg_affected_rows();

	$uniques = array();
	$resultSet = array();
	while ($limit > 0 && count($uniques) < $count) {
		$idx = random(1, $count);

		//query
		$record = FETCH ABSOLUTE $idx FROM _cur;

		// Skip records with a column value we want to be unique
		if (in_array($record[$uniqueColumn], $uniques)
			continue;

		$uniques[] = $record[$uniqueColumn];
		$resultSet[] = $record;
		$limit--;
	}

	// query
	CLOSE _cur;

	return $resultSet;
}

I hope this is useful to anyone. It worked for us; it is definitely
faster than order by random(), and more random than precalculated column
values. Plus it translates directly to what we are requesting :)

Alban Hertroys wrote:
> I thought of another solution (with only a few calculations of random())
> that can be deployed in existing versions of PG, using a set-returning
> function with a scrolling cursor that accepts the query string as input
> like this (in pseudoish-code):
> 
> ----
> create function random(text _query, integer _limit)
> returns set
> volatile
> as $$
> DECLARE
>     _cur cursor;
>     _cnt bigint;
>     _idx integer;
>     _rowpos bigint;
> 
>     _rec record;
> BEGIN
>     open _cur for execute query;
>     fetch forward all into _rec;
>     -- select total nr of records into _cnt
> 
>     for _idx in 1.._limit loop
>         _rowpos := random() * _cnt;
> 
>         fetch absolute _rowpos into _rec;
>         return next _rec;
>     end loop;
> 
>     return;
> END;
> $$
> language 'plpgsql';
> ----

-- 
Alban Hertroys
alban@xxxxxxxxxxxxxxxxx

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux