On Tue, Oct 10, 2006 at 10:15:42AM -0500, Jim C. Nasby scratched on the wall: > On Fri, Oct 06, 2006 at 10:37:26AM -0500, Jay A. Kreibich wrote: > > These are generally referred to as "Hierarchical Queries" and center > > around the idea of a self-referencing table (such as an employee > > table with a "manager" field that is a FK to another row in the same > > table). This essentially makes a tree-like structure. > <snip> > > As pointed out by others, the most general way to deal with this in > > PostgreSQL is to write PL/PgSQL (or some other language) functions > > that can generate the specific queries you need. It isn't always > > pretty, but it can be made to work for a specific set of queries. > > There are also other ways to represent this type of information without > using hierarchical queries. Joe Celko presents two methods in SQL For > Smarties. If you're referring to Joe's March 1996 DBMS article, (http://www.dbmsmag.com/9603d06.html) he does demonstrate two models, but one of them is the self-referencing table model where one column references another column in the same table. His only suggestion for dealing with these kinds of tables is self-joins (which I also mentioned) but points out the obvious limitation that-- unless you go procedural-- you have to know how many levels you're going to process before you setup the query. The other model that is shown (which he calls "nested-set") is interesting, but has a lot of properties that make me uncomfortable. (He proposes each node/row have two sequence counters ("left" and "right") represent pre- and post-visit order in a depth-first traversal; sets can be calculated by differences or betweens of the two values). For one, the table requires an extreme amount of maintenance-- something as simple as inserting a single leaf node may require updating every row in the whole table. On average, more than half the nodes/rows will require updating for each record insertion and removal, but it isn't clear how this update process would work (since the sequences require a traversal to update, but a proper traversal requires the correct sequences). There are tricks for the simple cases, but I'm not sure you could do an update in-place in the general case. The representation he's chosen also introduces an ordering among siblings-- while this is a required attribute of some tree structures, in most cases (and in the spirit of general SQL sets) the ordering of peer nodes/rows is undefined and unimportant. This isn't exactly a flaw, so much as an unexpected side-effect. In theory, I agree with his assertion that a conceptual "nested sets" approach is more SQLish (since SQL likes to deal with sets), but I don't think the implementation he presented actually has anything to do with sets (in the traditional sense) that are nested. The whole thing depends on understanding traversal orderings and some of the tricks you can play with that to indirectly define sets. I guess it all depends on how you look at it. I personally tend to think more in C++ than SQL anyways. I also noticed that Joe has a book out titled "Joe Celko's Trees and Hierarchies in SQL for Smarties". I have not yet had a chance to review this book (other than the on-line table of contents) but it looks interesting. While much of this is on graphs and more general edge/node structures, a fair bit of the book appears to be about this type of tree structure. He goes into more detail on some of these issues, such as insertion and deletion times, and tricks to play for inserting whole sub-trees, and that kind of thing. Maybe the book would sell the so-called "nested-set" implementation a bit better, but it still strikes me as a solution for warehouses, not OLTP style stuff. I might have to find this book and have a closer read. Thanks for the reference. > There's also the ltree module in contrib that might be of some use. Interesting. -j -- Jay A. Kreibich | CommTech, Emrg Net Tech Svcs jak@xxxxxxxx | Campus IT & Edu Svcs <http://www.uiuc.edu/~jak> | University of Illinois at U/C