Hello.
I created an aggregate:
CREATE AGGREGATE intarray_aggregate_push (_int4)
(
STYPE = _int4,
SFUNC = intarray_push_array,
INITCOND = '{}'
);
(Alternatively, I may use _int_union instead of intarray_push_array; its speed is practically the same in my case.)
This aggregate merges a list of integer[] arrays into one big array containing all of their elements.
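For illustration, here is what I expect it to do on two literal arrays (a minimal sketch; the VALUES list and the alias t(a) are just made-up test data, and I assume the rows reach the aggregate in the listed order):
SELECT intarray_aggregate_push(a)
FROM (VALUES ('{1,2}'::int4[]), ('{3,4}'::int4[])) AS t(a);
-- expected result: {1,2,3,4}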
Then I want to use this aggregate:
SELECT intarray_aggregate_push(arrayfield)
FROM arraytable
The table arraytable contains a lot of rows (about 5000), and each row holds an array of 5-10 elements, so the resulting array should contain about 50,000 elements.
The query works, but it is far too slow: about 1 second.
The main problem is the speed of the intarray_aggregate_push aggregate: it is quite slow, because intarray_push_array reallocates the memory each time it merges two arrays. I am pretty sure that this reallocation and copying is the bottleneck, because if I use another, dummy aggregate:
CREATE AGGREGATE intarray_aggregate_dummy (_int4)
(
STYPE = _int4,
SFUNC = dummy,
INITCOND = '{}'
);
CREATE OR REPLACE FUNCTION "public"."dummy" (a integer[], b integer[]) RETURNS integer[] AS
$body$
BEGIN RETURN a; END;
$body$
LANGUAGE plpgsql VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
where dummy() simply returns its first argument without any modification, the speed improves dramatically: about 25 ms instead of 1000 ms!
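(A rough back-of-envelope estimate, if my understanding of the state transitions is right: with ~5000 rows of ~7.5 elements each, the state array holds about 7.5*i elements after row i, and every call copies the whole state, so the total copying is about 7.5 * (1 + 2 + ... + 5000) ≈ 7.5 * 5000^2 / 2 ≈ 9*10^7 integers, i.e. roughly 350 MB moved around just to build a 50,000-element result.)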
The question is: how could I optimize this, and is it possible at all in Postgres? I just want to get one large array glued from a lot of smaller arrays...
P.S.
I have tested that the
SELECT array_to_string(ARRAY(SELECT text FROM tbl), ' ')
query is many times faster than concatenating all the "text" fields inside one PL/pgSQL stored function (I assume this is because Postgres does not reallocate & copy memory each time it glues on a new text piece). But unfortunately there is no way to convert integer[] to a string to use this method here: I could write
select '{1,2}'::integer[]
but I couldn't use
select ARRAY[1,2]::text
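Conceptually, what I am after is a single-pass gather of all the element values, something like the sketch below (arraytable and arrayfield are the names from my example above); I am not sure it is available to me, since unnest() only appeared in PostgreSQL 8.4:
SELECT ARRAY(SELECT unnest(arrayfield) FROM arraytable);
-- one integer[] holding every element from every row, built in a single pass
If something along these lines works, I would hope it sidesteps the repeated reallocation entirely.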