Re: Replacing Cursors with Temporary Tables

Merlin Moncure <mmoncure@xxxxxxxxx> · Thu, 22 Apr 2010 08:14:42 -0400

On Wed, Apr 21, 2010 at 4:16 PM, Eliot Gable
<egable+pgsql-performance@xxxxxxxxx> wrote:
> I have previously discussed my very long PL/PGSQL stored procedure on this
> list. However, without getting into too many details, I have another
> performance-related question.
>
> The procedure currently uses cursors to return multiple result sets to the
> program executing the procedure. Basically, I do this:
>
> BEGIN;
> SELECT * FROM stored_proc();
> FETCH ALL FROM cursor1;
> FETCH ALL FROM cursor2;
> FETCH ALL FROM cursor3;
> etc.
> COMMIT;
>
> However, there are some cases in the stored procedure where some of the
> result sets returned by these cursors are also needed as inputs to
> additional queries. To use them, I am currently doing:
>
> FOR temp IN cursorX LOOP
>   -- Some code that pushes the current temp record onto the end of an array
> END LOOP;
> OPEN cursorX;
> MOVE FIRST FROM cursorX;
>
> Then, when I need to use the results in a query, I do something like:
>
> SELECT * FROM table1 INNER JOIN (SELECT * FROM unnest(result_array)) AS
> table2 ON ( blah blah ) WHERE blah
>
> This seems extremely inefficient to me. First, I'm not sure of the penalty
> for unnesting an array into a SET OF object. Second, the set of records
> returned from unnesting would not be indexed for the join which means a
> sequential scan. Third, building the array in the first place using
> array_append seems extremely inefficient. Fourth, opening the cursor twice
> seems like it would execute the query twice, though given the placement and
> context, it's probably got it cached somewhere (hopefully). I'm sure there
> are probably other things I am overlooking.

*) don't use temp tables unless there is no other way (for example, if
the set is quite large)
*) unnest is cheap unless the array is large
*) Don't build arrays thay way:

declare a_cursor for a_query

becomes
CREATE FUNCTION get_foos(out foo[]) RETURNS record AS -- foo is a
table or composite type
$$
BEGIN
  select array (a_query) into foos;
  [...]
$$ language plpgsql;

In 8.4, we will manipulate the results typically like this:

WITH f AS (select unnest(foos) as foo)
SELECT * from f join bar on (f).foo.bar_id= bar.bar_id [...]

or this:
WITH f AS (select (foo).* from (select unnest(foos) as foo) q)
SELECT * from f join bar on f.bar_id= bar.bar_id [...]

This will use an index on bar.bar_id if it exists.  Obviously, any
indexes on foo are not used after creating the array but doesn't
matter much as long as the right side is indexed.  Your cursor method
does do any better in this regard.  You can create an index on a temp
table but the cost of building the index will probably be more than
any savings you get unless this is some type of special case, for
example if the left (temp table) side is big and you need to have it
sorted from that side.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance