Hello.
I created an aggregate:
CREATE AGGREGATE intarray_aggregate_push (_int4)
(
STYPE = _int4,
SFUNC = intarray_push_array,
INITCOND = '{}'
);
(Alternatively, I may use _int_union instead of intarray_push_array; its speed is practically the same in my case.)
This aggregate merges a list of integer[] arrays into one big array containing all of their elements.
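For illustration, here is what I expect it to do on two literal arrays (a minimal sketch; the VALUES list and the alias t(a) are just made-up test data, and I assume the rows reach the aggregate in the listed order):
SELECT intarray_aggregate_push(a)
FROM (VALUES ('{1,2}'::int4[]), ('{3,4}'::int4[])) AS t(a);
-- expected result: {1,2,3,4}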
Then I want to use this aggregate:
SELECT intarray_aggregate_push(arrayfield)
FROM arraytable
The table arraytable contains a lot of rows (about 5000), and each row holds an array of 5-10 elements, so the resulting array should contain about 50,000 elements.
The query works, but it is far too slow: about 1 second.
The main problem is the speed of the intarray_aggregate_push aggregate: it is quite slow, because intarray_push_array reallocates the memory each time it merges two arrays. I am pretty sure that this reallocation and copying is the bottleneck, because if I use another, dummy aggregate:
CREATE AGGREGATE intarray_aggregate_dummy (_int4)
(
STYPE = _int4,
SFUNC = dummy,
INITCOND = '{}'
);
CREATE OR REPLACE FUNCTION "public"."dummy" (a integer[], b integer[]) RETURNS integer[] AS
$body$
BEGIN RETURN a; END;
$body$
LANGUAGE plpgsql VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
where dummy() simply returns its first argument without any modification, the speed improves dramatically: about 25 ms instead of 1000 ms!
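(A rough back-of-envelope estimate, if my understanding of the state transitions is right: with ~5000 rows of ~7.5 elements each, the state array holds about 7.5*i elements after row i, and every call copies the whole state, so the total copying is about 7.5 * (1 + 2 + ... + 5000) ≈ 7.5 * 5000^2 / 2 ≈ 9*10^7 integers, i.e. roughly 350 MB moved around just to build a 50,000-element result.)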
The question is: how could I optimize this, and is it possible at all in Postgres? I just want to get one large array glued from a lot of smaller arrays...
P.S.
I have tested that the
SELECT array_to_string(ARRAY(SELECT text FROM tbl), ' ')
query is many times faster than concatenating all the "text" fields inside one PL/pgSQL stored function (I assume this is because Postgres does not reallocate & copy memory each time it glues on a new text piece). But unfortunately there is no way to convert integer[] to a string to use this method here: I could write
select '{1,2}'::integer[]
but I couldn't use
select ARRAY[1,2]::text
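Conceptually, what I am after is a single-pass gather of all the element values, something like the sketch below (arraytable and arrayfield are the names from my example above); I am not sure it is available to me, since unnest() only appeared in PostgreSQL 8.4:
SELECT ARRAY(SELECT unnest(arrayfield) FROM arraytable);
-- one integer[] holding every element from every row, built in a single pass
If something along these lines works, I would hope it sidesteps the repeated reallocation entirely.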