On Thu, 26 Oct 2006, Vincent Ladeuil wrote:

> >>>>> "Linus" == Linus Torvalds <torvalds@xxxxxxxx> writes:
>
> Linus> Commits are defined by a _combination_ of:
>
> Linus>  - the tree they commit (which is recursive, so the
> Linus>    commit name indirectly includes information EVERY
> Linus>    SINGLE BIT in the whole tree, in every single file)
>
> And here you keep that separate from any SCM related info,
> right ?

I don't understand that question.

The commits contain the tree information. A raw commit in git (this is the
true contents of the current top commit in my kernel tree, just added
indentation and an empty line between the command I used to generate it
and the output, to make it stand out better in the email) looks something
like this:

    [torvalds@g5 linux]$ git-cat-file commit HEAD

    tree ba1ed8c744654ca91ee2b71b7cdee149c8edbef1
    parent 2a4f739dfc59edd52eaa37d63af1bd830ea42318
    parent 012d64ff68f304df1c35ce5902f5023dc14b643f
    author Linus Torvalds <torvalds@xxxxxxxxxxx> 1161873881 -0700
    committer Linus Torvalds <torvalds@xxxxxxxxxxx> 1161873881 -0700

    Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

    * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
      [SPARC64]: Fix memory corruption in pci_4u_free_consistent().
      [SPARC64]: Fix central/FHC bus handling on Ex000 systems.

where the _name_ of the commit is

    [torvalds@g5 linux]$ git-rev-parse HEAD

    e80391500078b524083ba51c3df01bbaaecc94bb

ie the commit itself contains the exact tree name (and the name of the
parents), and the name of the commit is literally the SHA1 of the contents
of the commit (plus a git-specific header).

> >> Trees are defined by their content only ?
>
> Linus> Where "contents" does include names and
> Linus> permissions/types (eg execute bit and symlink etc).
>
> Which can also be expressed as: "Everything the user can
> manipulate outside the SCM context", right ?

Again, I'm not sure what you mean by that. The SCM does not track _everything_.
It does not track user names and inode numbers, so in a sense a developer
can change things that the SCM simply doesn't _care_ about and never
tracks. But yes, the tree contents uniquely identify the exact contents
that the user cares about.

> Linus> If you compare the commit name, and they are equal,
> Linus> you automatically know
>
> Linus>  - the trees are 100% identical
> Linus>  - the histories are 100% identical
>
> And that's the only info you can get, no ordering here.

No, there is ordering there too. But yes, the ordering is not in the name
itself, you have to go look at the actual commit history to see it. The
name is just an identifier.

> Linus> If you only care about the actual tree, you compare
> Linus> the tree name for equality, ie you can do
>
> Linus>     git-rev-parse commit1^{tree} commit2^{tree}
>
> Linus> and compare the two: if and only if they are equal are
> Linus> the actual contents 100% equal.
>
> Actually, that's backwards:
>
> "their actual contents are equal" implies "their signatures are
> equal".

No. If the signatures are equal, the contents are equal, and vice versa.
It really is a two-way thing.

> But, two totally different trees can have the same signature.

No. Don't even think that way. That just confuses you. The hash is
cryptographic, and large enough, that you really can equate the contents
with the hash. Anything else is just not even interesting.

        Linus
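
To make the "name is the SHA1 of the contents plus a git-specific header"
point concrete, here is a minimal Python sketch. It assumes the classic
SHA-1 object format, where the header is the object type, a space, the
body length in bytes as decimal, and a NUL byte. The helper name
git_object_name and the two commit bodies (author, timestamp, message)
are made up for illustration. They are not the real commit quoted above,
whose addresses are obscured in the archive, so its name cannot be
reproduced from this text.

```python
import hashlib


def git_object_name(obj_type: str, body: bytes) -> str:
    """Hex SHA-1 over '<type> <length>\\0' followed by the raw body."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Two well-known names: the empty blob and the empty tree.
EMPTY_BLOB = git_object_name("blob", b"")
EMPTY_TREE = git_object_name("tree", b"")


def make_commit_body(tree_name: str) -> bytes:
    """Build a hypothetical raw commit body that points at tree_name."""
    return (
        f"tree {tree_name}\n"
        "author A U Thor <author@example.com> 1161873881 -0700\n"
        "committer A U Thor <author@example.com> 1161873881 -0700\n"
        "\n"
        "Example commit\n"
    ).encode("utf-8")


# Same metadata, different tree name: the commit name changes too,
# because the tree name is part of the bytes being hashed.
commit_a = git_object_name("commit", make_commit_body(EMPTY_TREE))
commit_b = git_object_name("commit", make_commit_body("0" * 40))

if __name__ == "__main__":
    print(EMPTY_BLOB)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    print(EMPTY_TREE)  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
    print(commit_a != commit_b)  # True
```

Since the commit body embeds the tree name and the parent names, equal
commit names mean equal trees and equal histories, which is the two-way
equivalence argued above.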