Re: Recursive use

"Jay A. Kreibich" <jak@xxxxxxxx> · Tue, 10 Oct 2006 16:33:46 -0500

On Tue, Oct 10, 2006 at 10:15:42AM -0500, Jim C. Nasby scratched on the wall:
> On Fri, Oct 06, 2006 at 10:37:26AM -0500, Jay A. Kreibich wrote:
> >   These are generally referred to as "Hierarchical Queries" and center
> >   around the idea of a self-referencing table (such as an employee
> >   table with a "manager" field that is a FK to another row in the same
> >   table).  This essentially makes a tree-like structure.
> <snip>
> >   As pointed out by others, the most general way to deal with this in
> >   PostgreSQL is to write PL/PgSQL (or some other language) functions
> >   that can generate the specific queries you need.  It isn't always
> >   pretty, but it can be made to work for a specific set of queries.
> 
> There are also other ways to represent this type of information without
> using hierarchical queries. Joe Celko presents two methods in SQL For
> Smarties.

  If you're referring to Joe's March 1996 DBMS article
  (http://www.dbmsmag.com/9603d06.html), he does demonstrate two models,
  but one of them is the self-referencing table model where one column
  references another column in the same table.  His only suggestion for
  dealing with these kinds of tables is self-joins (which I also
  mentioned), but he points out the obvious limitation that-- unless you go
  procedural-- you have to know how many levels you're going to process
  before you set up the query.
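
  To make the fixed-depth limitation concrete, here is a minimal sketch.
  The "employee" table, its columns, and the sample names are made up for
  illustration, and it runs against an in-memory SQLite database through
  Python's sqlite3 module only so that it is self-contained; the same
  SELECT should work on PostgreSQL.  Each level of the tree costs one more
  self-join, so this query reaches exactly two levels below the root and
  never sees anyone deeper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        manager INTEGER REFERENCES employee(id)  -- NULL marks the root
    );
    INSERT INTO employee VALUES
        (1, 'Alice', NULL),
        (2, 'Bob',   1),
        (3, 'Carol', 1),
        (4, 'Dave',  2),
        (5, 'Eve',   4);
""")

# One alias per level: l0 is the root, l1 its reports, l2 their reports.
# Going one level deeper means editing the query to add another join.
two_levels = conn.execute("""
    SELECT l0.name, l1.name, l2.name
    FROM employee AS l0
    JOIN employee AS l1 ON l1.manager = l0.id
    JOIN employee AS l2 ON l2.manager = l1.id
    WHERE l0.manager IS NULL
    ORDER BY l2.id
""").fetchall()

print(two_levels)
```

  Eve sits three levels below Alice, so she does not appear in the result
  at all unless a third join is added by hand.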

  The other model that is shown (which he calls "nested-set") is
  interesting, but has a lot of properties that make me uncomfortable.
  (He proposes that each node/row have two sequence counters ("left" and
  "right") that represent pre- and post-visit order in a depth-first
  traversal; sets can be calculated by differences or betweens of the
  two values.)
  For one, the table requires an extreme amount of maintenance-- something
  as simple as inserting a single leaf node may require updating every
  row in the whole table.  On average, more than half the nodes/rows will
  require updating for each record insertion and removal, but it isn't clear
  how this update process would work (since the sequences require a
  traversal to update, but a proper traversal requires the correct
  sequences).  There are tricks for the simple cases, but I'm not sure
  you could do an update in-place in the general case.
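
  For the simple case of adding a single leaf, one commonly described
  trick is to open a gap of two in the numbering and drop the new row
  into it.  The sketch below is my own illustration of that trick, not
  necessarily what the article does; the "node" table, the lft/rgt column
  names, the insert_leaf() helper, and the sample data are all
  hypothetical, and SQLite (via Python's sqlite3) is used only to keep it
  runnable.  It returns how many existing rows had to be renumbered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        name TEXT PRIMARY KEY,
        lft  INTEGER NOT NULL,   -- pre-visit counter ("left")
        rgt  INTEGER NOT NULL    -- post-visit counter ("right")
    );
    -- Alice is the root; Bob and Carol are her children,
    -- Dave is under Bob, Eve is under Dave.
    INSERT INTO node VALUES
        ('Alice', 1, 10), ('Bob', 2, 7), ('Dave', 3, 6),
        ('Eve', 4, 5), ('Carol', 8, 9);
""")

def insert_leaf(conn, parent, name):
    """Add `name` as the last child of `parent`.

    Returns the number of existing rows whose counters were changed.
    """
    (parent_rgt,) = conn.execute(
        "SELECT rgt FROM node WHERE name = ?", (parent,)
    ).fetchone()
    # Every counter at or past the parent's right edge moves up by two.
    # Any row with lft > parent_rgt also has rgt > parent_rgt, so the
    # WHERE clause on rgt alone selects every row that needs a change.
    cur = conn.execute(
        "UPDATE node SET "
        "  lft = CASE WHEN lft > ? THEN lft + 2 ELSE lft END, "
        "  rgt = rgt + 2 "
        "WHERE rgt >= ?",
        (parent_rgt, parent_rgt),
    )
    touched = cur.rowcount
    # The new leaf takes the two freed counter values.
    conn.execute(
        "INSERT INTO node VALUES (?, ?, ?)",
        (name, parent_rgt, parent_rgt + 1),
    )
    return touched

touched = insert_leaf(conn, "Eve", "Frank")
after = conn.execute("SELECT name, lft, rgt FROM node ORDER BY lft").fetchall()
print(touched, after)
```

  In this five-row table, hanging one leaf under Eve renumbers all five
  existing rows, which is the maintenance cost I'm worried about.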

  The representation he's chosen also introduces an ordering among siblings--
  while this is a required attribute of some tree structures, in most 
  cases (and in the spirit of general SQL sets) the ordering of peer
  nodes/rows is undefined and unimportant.  This isn't exactly a flaw,
  so much as an unexpected side-effect.

  In theory, I agree with his assertion that a conceptual "nested sets"
  approach is more SQLish (since SQL likes to deal with sets), but I don't
  think the implementation he presented actually has anything to do with
  sets (in the traditional sense) that are nested.  The whole thing depends
  on understanding traversal orderings and some of the tricks you can play
  with that to indirectly define sets.  I guess it all depends on how you
  look at it.  I personally tend to think more in C++ than SQL anyways.
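
  As a rough illustration of those tricks, here is a minimal sketch of
  how the two counters indirectly define a set.  The "node" table, the
  lft/rgt column names, the two helper functions, and the sample data are
  assumptions of mine, and SQLite (via Python's sqlite3) is used only so
  the example runs on its own.  A subtree is every row whose "left" value
  falls between the root's two counters, and the number of descendants
  falls out of the difference between them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        name TEXT PRIMARY KEY,
        lft  INTEGER NOT NULL,   -- pre-visit counter
        rgt  INTEGER NOT NULL    -- post-visit counter
    );
    -- Alice is the root; Bob and Carol are her children,
    -- Dave is under Bob, Eve is under Dave.
    INSERT INTO node VALUES
        ('Alice', 1, 10), ('Bob', 2, 7), ('Dave', 3, 6),
        ('Eve', 4, 5), ('Carol', 8, 9);
""")

def subtree(conn, root):
    """Names in the subtree rooted at `root` (root included), in visit order."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM node AS p
        JOIN node AS c ON c.lft BETWEEN p.lft AND p.rgt
        WHERE p.name = ?
        ORDER BY c.lft
        """,
        (root,),
    )
    return [name for (name,) in rows]

def descendant_count(conn, root):
    """Each descendant uses up two counter values between lft and rgt."""
    lft, rgt = conn.execute(
        "SELECT lft, rgt FROM node WHERE name = ?", (root,)
    ).fetchone()
    return (rgt - lft - 1) // 2

bob_tree = subtree(conn, "Bob")
bob_count = descendant_count(conn, "Bob")
print(bob_tree, bob_count)
```

  No recursion and no fixed number of joins is needed, but note that the
  "set" only exists because of the traversal numbering, and the ORDER BY
  exposes the sibling ordering mentioned above.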

  I also noticed that Joe has a book out titled "Joe Celko's Trees and
  Hierarchies in SQL for Smarties".  I have not yet had a chance to
  review this book (other than the on-line table of contents) but it
  looks interesting.  While much of it is about graphs and more general
  edge/node structures, a fair bit of the book appears to be about this
  type of tree structure.  He goes into more detail on some of these
  issues, such as insertion and deletion times, and tricks to play for
  inserting whole sub-trees, and that kind of thing.  Maybe the book
  would sell the so-called "nested-set" implementation a bit better,
  but it still strikes me as a solution for warehouses, not OLTP style
  stuff.  I might have to find this book and have a closer read.

  Thanks for the reference.

> There's also the ltree module in contrib that might be of some use.

  Interesting.

   -j

-- 
                     Jay A. Kreibich | CommTech, Emrg Net Tech Svcs
                        jak@xxxxxxxx | Campus IT & Edu Svcs
          <http://www.uiuc.edu/~jak> | University of Illinois at U/C