Hi, everyone. I'm working on an application that needs to perform a
query twice -- once to retrieve the total number of rows in a table, and
a second that then retrieves chunks of rows. The current implementation
executes the query twice, once with COUNT(*) and another with OFFSET and
LIMIT being passed to a pl/pgsql function. Needless to say, the
performance isn't so hot.
I've managed to improve things quite a bit by using cursors, but I've
been stumped in trying to find a replacement for the COUNT(*). I wrote
a function that works great on 9.0:
CREATE OR REPLACE FUNCTION count_the_rows() RETURNS INTEGER AS $$
DECLARE
mycursor CURSOR FOR SELECT * FROM test_table;
number_of_rows integer := 0;
BEGIN
OPEN mycursor;
MOVE ALL IN mycursor;
GET DIAGNOSTICS number_of_rows := ROW_COUNT;
RETURN number_of_rows;
END;
$$ LANGUAGE 'plpgsql';
Unfortunately, the project is using 8.3, and the function refuses to
even compile, due to the "MOVE ALL". When we change it to something
else (such as MOVE 20000000, which returns the actual number of rows
skipped to psql), or ABSOLUTE -1, we get 0 back from ROW_COUNT.
I know (from the recent discussion on pgsql-performance, among other
places) that count(*) is inherently slow. I'm excited to have found a
solution that at least allows us to avoid the execution of a complex
query twice in a row. But is there any way for me to get, in 8.3, the
number of rows over which a cursor has skipped? Keep in mind that after
this count has executed, we're then going to rewind the cursor, chunking
through the result set with a separate function.
Thanks in advance,
Reuven
--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general