Immutable table functions

Luiz Angelo Daros de Luca <luizluca@xxxxxxxxx> · Fri, 12 Feb 2010 10:54:58 -0200

Hello
guys,

I
don't know if this is the correct list. Correct me if I'm wrong.

I
have a directed graph, or better, a tree in postgresql 8.3. One table
are the nodes and another one are the connections. Given any node, I
need to get all nodes down to it(to the leafs) that have relation
with anotherTable. Also, this connection change on time, so I have a
connectionDate and a disconnectionDate for each connection (which
might be null to represent open interval). This way, I wrote a pgsql
function (I rename the tables and columns to generic names). These
are the tables and the function:

CREATE
TABLE node (
id_node
integer NOT NULL,
	CONSTRAINT
node_pkey PRIMARY KEY (id_node)
);
CREATE
TABLE  anotherTable
(

id_anotherTable integer NOT NULL,
  id_node
integer NOT NULL,

CONSTRAINT anothertable_pkey PRIMARY KEY
(id_anotherTable),

CONSTRAINT anothertable_node_fkey FOREIGN KEY (id_node)
REFERENCES node. id_node

MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION
);
CREATE
TABLE connection
(

id_connection integer NOT NULL,
  down
integer NOT NULL,
  up
integer NOT NULL,

connectionDate timestamp with time zone,

disconnectionDate timestamp with time zone,

CONSTRAINT connection_pkey PRIMARY KEY (id_connection),

CONSTRAINT down_fkey FOREIGN KEY (down)

REFERENCES  (id_node) REFERENCES node. id_node MATCH
SIMPLE
      ON
UPDATE NO ACTION ON DELETE NO ACTION,

CONSTRAINT up_fkey FOREIGN KEY (up)

REFERENCES  (id_node) REFERENCES node. id_node MATCH
SIMPLE
      ON
UPDATE NO ACTION ON DELETE NO ACTION);
CREATE
TABLE  observation
(

id_observation integer NOT NULL,
  id_node
integer NOT NULL,
  date
timestamp with time zone,  

CONSTRAINT observation_pkey PRIMARY KEY
(id_observation),

CONSTRAINT observation_node_fkey FOREIGN KEY (id_node)
REFERENCES node. id_node

MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION
);

CREATE
OR REPLACE FUNCTION
get_nodes_related_to_anothertable(integer,timestamp with time zone)
RETURNS SETOF integer AS 'DECLARE
��� _id ALIAS FOR
$1;
��� _date ALIAS FOR $2;
���
_conn RECORD;
BEGIN 
��� return query SELECT 1
FROM anothertable WHERE id_node  = _id;
��� FOR _
conn IN SELECT * FROM connection c where c.up = _id LOOP
���
��� if _conn. connectionDate > _date then
���
��� ��� continue;
���
��� end if;
��� ���
if _conn. disconnectionDate < _data then
��� ���
��� continue;
��� ���
end if;
��� ��� return query SELECT
* from get_nodes_related_to_anothertable(_conn.down, _date);
���
END LOOP;
END' LANGUAGE 'plpgsql' IMMUTABLE;

And
I use it on my SELECT:

SELECT
	*
FROM
	(SELECT

		id_node,
		date
	FROM

		observation
	)
root_node_obs,
	node,
	anotherTable
WHERE
	anotherTable.id_node
= node.id_node
	AND
	node.id_node
IN (
		select
* from get_nodes_related_to_anothertable(root_node_obs
.id_node,root_node_obs .date));

Even
with IMMUTABLE on the function, postgresql executes the function many
times with the same parameters. In a single run:

select
* from get_nodes_related_to_anothertable(236,now());

it
returns 5 rows and runs in 27ms. But in the bigger SELECT, it take 5s
to each observation row (and I may have some :-) ).

I
know that IN generally is not optimization-friendly but I don't know
how to use my function without it.
Any
clues guys?

Thanks,