plpgsql arrays

Matthew Wakeling <matthew@xxxxxxxxxxx> · Fri, 3 Apr 2009 14:32:54 +0100 (BST)

I'm writing a plpgsql function that effectively does a merge join on the 
results of two queries. Now, it appears that I cannot read the results of 
two queries as streams in plpgsql, so I need to copy the contents of one 
query into an array first, and then iterate over the second query 
afterwards.

I have discovered that creating large arrays in plpgql is rather slow. In 
fact, it seems to be O(n^2). The following code fragment is incredibly 
slow:

 genes = '{}';
 next_new = 1;
 FOR loc IN SELECT location.* FROM location, gene WHERE location.subjectid = gene.id ORDER BY objectid, intermine_start, intermine_end LOOP
     genes[next_new] = loc;
     IF (next_new % 10000 = 0) THEN
         RAISE NOTICE 'Scanned % gene locations', next_new;
     END IF;
     next_new = next_new + 1;
 END LOOP;
 genes_size = coalesce(array_upper(genes, 1), 0);
 RAISE NOTICE 'Scanned % gene locations', genes_size;

For 200,000 rows it takes 40 minutes.

So, is there a way to dump the results of a query into an array quickly in 
plpgsql, or alternatively is there a way to read two results streams 
simultaneously?

Matthew

--
I would like to think that in this day and age people would know better than
to open executables in an e-mail. I'd also like to be able to flap my arms
and fly to the moon.                                    -- Tim Mullen

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance