On Thu, Oct 26, 2006 at 12:52:05PM +0200, Vincent Ladeuil wrote:

> Ok, so git make a distinction between the commit (code created by
> someone) and the tree (code only).

Yes (a commit is a tree, zero or more parents, commit message, and
author/committer info).

> Commits are defined by their parents.

Partially, yes.

> Trees are defined by their content only ?

Yes.

> Calculate a sha1 representing the content (or the content of the
> diff from parent) of all the files and dirs in the tree ? Or
> from the sha1s of the files and dirs themselves recursively based
> on sha1s of the files and dirs they contain ?

Recursively. Each tree is an ordered list of 4-tuples: pathname, type,
sha1, mode. If the type is "blob" then the sha1 is the hash of the file
contents. If the type is "tree" then the sha1 is the id of a sub-tree.
The id of a tree is the sha1 hash of the data structure.

> I ask because the later seems to provide some nice effects
> similar to what makes BDD
> (http://en.wikipedia.org/wiki/Binary_decision_diagram) so
> efficient: you can compare graphs of any complexity or size in
> O(1) by just comparing their signatures.

Yes, if two trees' hashes compare equal, they contain the same data. I
believe we are not currently using this optimization to find merge
differences, but there was some discussion earlier this week about doing
so.

-Peff
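
P.S. To make the recursive scheme concrete, here is a minimal Python sketch of
how a tree id can be derived from the ids of its entries. It is an
illustration, not git's own code. It assumes an in-memory tree given as a
nested dict (a hypothetical representation: names map to bytes for files or to
dicts for sub-trees), it gives every file the regular-file mode 100644, and it
ignores symlinks, executables and submodules. The helper names object_id,
blob_id and tree_id are made up for this example. The byte layout follows
git's object format as I understand it: each object is hashed together with a
"<type> <size>\0" header, and the type of a tree entry is implied by its mode.

```python
import hashlib


def object_id(kind: bytes, body: bytes) -> bytes:
    """Raw 20-byte SHA-1 of an object: '<kind> <size>\\0' header plus body."""
    header = kind + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).digest()


def blob_id(data: bytes) -> bytes:
    """Id of a file: depends only on its contents."""
    return object_id(b"blob", data)


def tree_id(tree: dict) -> bytes:
    """Id of a directory: depends only on entry names, modes and child ids."""
    entries = []
    for name, node in tree.items():
        if isinstance(node, dict):
            # Sub-tree: recurse. Directories sort as if named "name/".
            entries.append((name + "/", b"40000", name, tree_id(node)))
        else:
            # Assumption: every file is a plain, non-executable file.
            entries.append((name, b"100644", name, blob_id(node)))
    # The list is ordered, so the same content always serializes the same way.
    entries.sort(key=lambda e: e[0].encode())
    body = b"".join(
        mode + b" " + name.encode() + b"\0" + child
        for _, mode, name, child in entries
    )
    return object_id(b"tree", body)


if __name__ == "__main__":
    a = {"README": b"hi\n", "src": {"main.c": b"int main(void){return 0;}\n"}}
    b = {"src": {"main.c": b"int main(void){return 0;}\n"}, "README": b"hi\n"}
    # Same content, built in a different order: the ids match.
    print(tree_id(a).hex() == tree_id(b).hex())
```

Comparing two trees of any size is then a single comparison of two 20-byte
ids, and a change to any file changes the id of every tree above it, which is
the BDD-like property mentioned above.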