On Thu, Jan 20, 2011 at 7:00 AM, Richard Quadling <rquadling@xxxxxxxxx>wrote: > I'd recommend using a nested set approach for the tags > (http://dev.mysql.com/tech-resources/articles/hierarchical-data.html > gives a good explanation on the issues and methodology of nested > sets). > Thanks for the link. That article proposes an interesting way to organize the categories. Have you implemented this in the wild? Clearly the design would work as it's pretty simple, and I like that it removes the need for recursive queries. Dotan, the Venn diagrams are just used to explain the concept. If you use the code to determine the left and right values, you can ignore the diagrams entirely. As long as you're not adding/removing categories every minute, having to recalculate left and right values isn't that big of a deal. Also, there's no reason you couldn't keep the parent_id field with the nested sets. It would come in handy for certain types of queries, though it's not necessary. On Thu, Jan 20, 2011 at 11:40 AM, Jerry Schwartz <jerry@xxxxxxxxx> wrote: > I disagree. The method I proposed can be extended to any depth, and any > leaf or branch can be retrieved with a single query. > The nested set method can be extended to any depth, and it pays off more the larger the hierarchy grows. While you can retrieve any branch (all ancestors) of a node with a single SQL query, the SQL engine itself actually must perform a recursive query meaning multiple hits on the parent_id index. On Thu, Jan 20, 2011 at 11:59 AM, Dotan Cohen <dotancohen@xxxxxxxxx> wrote: > I suppose for retrievals this structure has advantages, but unless > MySQL has a ++ operator (or better yet, one that adds or subtracts 2 > from an int) then it looks to be a pain to add nodes. > ++ or += wouldn't be any better here than x = x + 2. Once you're modifying indexed values, you'll pay a much higher price writing to disk than += could ever save you in CPU cycles. The beauty is that inserting a node requires only two update statements that will fix *all* categories that need to be adjusted. Adding categories to the hierarchical model is definitely faster so it comes down to your insert-to-select ratio. Moving a subtree is also much easier with the hierarchical model. David