On Tue, Aug 3, 2010 at 8:01 AM, Jason Schauberger <crossroads0000@xxxxxxxxxxxxxx> wrote: > Dear fellow Postgres users, :-) > > please consider this table: > > CREATE TABLE nodes ( > > id int PRIMARY KEY, > > parent int REFERENCES nodes(id) > > ); > > In this table, each node *can* have a parent node. You can picture the > whole set of rows of this table as one or more trees with nodes and > the root of the tree is the node which has no parent node (that is, > parent is NULL). > > Now here's my objective: I want to *quickly* find all nodes that have > the same root node. > > I could iterate through the complete tree in question starting from > the root node and going all the way through to the terminal/leaf nodes > (those without a child), but that could take quite long depending on > how large the tree is. > > So, what I could do is this: > > CREATE TABLE nodes ( > > id int PRIMARY KEY, > > parent int REFERENCES nodes(id), > > root int NOT NULL REFERENCES nodes(id) > > ); > > and fill out the root column every time I insert a node. But then > there is the problem of anomalies, where the root column could have an > incorrect value (that is, the id of a node which is actually NOT the > root of the same tree). I guess I could check for correctness using > triggers. yes, you should absolutely do that -- the role of the database is to (as much as possible) make anomalies in your data impossible. If you go this route, set the root_id node in the trigger. Postgres triggers aren't strictly portable, but should be able to be migrated with minimal effort. Looking it up in the application is possible, but undesirable IMO. If you are only specifically interested in the root node, you can materialize it in the application and index it. This isn't a general solution for investigating familial relationships of nodes, but it doesn't sound like you need something general. The general solution that is going to be most portable is going probably going to be a materialized path approach. You encode the parents of a node in a string (in postgres you can use an array) in such away that you can pull out specific parents_ids. Materialized approaches have at least one big downside: updates to the tree are extremely expensive. Here is a pretty neat way of doing nested data with farey fractions with SQL examples designed for Oracle -- they should be able to be ported to postgres without too much effort: http://arxiv.org/html/cs/0401014. I've been looking for an excuse to play with it! merlin -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general