Hello all, long time no chit-chat on the PG mailing list.
We’re upgrading from 8.0.3 to 8.3 and found that some stored procedures
utilizing int_agg that we had left over from 7.3 had terrible
performance. No problem, using ANY() we’re able to regain that
performance, more or less, and at the same time greatly simplify our stored procedures.
But things can never be fast enough, can they? So I have a question or
two. Here’s my function for reference: CREATE OR REPLACE FUNCTION bg_nodes2descendants(INT[], INT)
RETURNS SETOF connection_generation AS ' DECLARE _row connection_generation%ROWTYPE; _children INT[]; BEGIN -- this is faster than constructing in the loop below --_children = array(SELECT connectee_node_id FROM
connections WHERE connection_type_id = 1 AND connector_node_id = ANY($1)); FOR _row IN SELECT connection_id, connection_type_id,
connector_node_id, connector_node_type_id, connectee_node_id, connectee_node_type_id,
current, timestamp, $2 + 1 FROM connections
WHERE connection_type_id = 1 AND connector_node_id = ANY($1) LOOP _children :=
_children || _row.connectee_node_id; RETURN NEXT _row; END LOOP; IF FOUND THEN RETURN QUERY SELECT * FROM
bg_nodes2descendants(_children, $2+1); END IF; RETURN; END ' LANGUAGE 'plpgsql'; So, my concern is alluded to in the
comment above. When I use this function in places where it returns large
results, building the _children array directly (in the commented out line) is
about 25% faster. But I’d like to avoid building the children array
altogether and would instead like to generate that array from the already
collected output rows. For example, right before the recursive call, I’d
like to select a column of the buffered output rows, cast it to an integer[],
and pass it into the recursive call. Is there an internal value I can
access for this such as: _children := array(SELECT connectee_node_id FROM $output); Bonus question - if I rewrite the first FOR loop as: RETURN QUERY SELECT connection_id, connection_type_id,
connector_node_id, connector_node_type_id, connectee_node_id, connectee_node_type_id,
current, timestamp, $2 + 1 FROM connections
WHERE connection_type_id = 1 AND connector_node_id = ANY($1); I get “ERROR: structure of query does not match
function result type”, even though the type signatures of the returned
columns match the “connection_generation” rowtype. I am pretty
sure this could be resolved by casting the resulting columns to that row type,
but I am lost as to how the syntax to do such a thing would look. Thanks in advance for the help, and keep up the great
work. PG8.3 is an amazing piece of software and it blows me away how much
more advanced it gets with every release. Bart Grantham VP of R&D Logicworks Inc. – Complex and Managed Hosting |