Hi,

First, let me start out by saying that I'm a fairly new contributor to Git, and I'm far less experienced than the other people on this thread. I've read through all the discussions time and again, and thought about the problem for some time now. I can't say I understand it as fully as many of you do, but I think I may have a slightly different perspective to offer.

In what way is Git fundamentally different from Subversion? It's the simplicity of the data model. From the simplest building block, a key-value store, we have been able to compose and build things on top of it. The reason we built centralized version control systems earlier is that it was *easier* to address the composition problems. We dumped all the related repositories and their problems into one central server. With so much information in one place, things are tightly coupled and problems are easier to solve.

Still not convinced? What's the weakest component in Git today? Undoubtedly submodules. Of course, a large part of the reason is that many people don't use submodules, and hence they don't improve -- but it's actually a circular problem. People don't use submodules because they're so featureless and hard to develop. Why are they so hard? Back to the fundamental problem of composition from simple building blocks. In submodules, we have to take entire DAGs and build a composite DAG. The key pieces of information are deep inside Git's fundamentals: Gitlinks. Other projects like Gitslave try to attack the problem on a more superficial level, but they all hit a barrier when they discover that they can't compose big blocks of data: you need simple building blocks to compose.

It's the same story with C (and now, Haskell). Why does everyone like C so much? Because it only provides fundamental building blocks and gives people the freedom to compose the way they like. It doesn't provide big "template blocks" like Java, because they tend to be restrictive in the long run.
Sure, Java is easier to start out with, but people soon realize that big blocks can't compose.

More than arguing about backward compatibility, and about how commits made with older versions of Git won't have generation numbers, I think this is what we should be focusing on. Sure, it'll additionally make sense to put in a cache to speed things up now, but we need to think about what Git will be 10 to 15 years from now. The fundamental pieces of information required for composition must be present in the fundamental building blocks.

The real question we should be asking is: "Should Git have had commit generation numbers in 2005?". If the answer is "yes", we should put them in now before it becomes even harder, bending over backwards for backward compatibility if necessary. Otherwise, we'll regret this decision 10 to 15 years later, when we're faced with deeper issues. If you want a concrete example, think about how you'd compose DAGs together (again, the submodules problem): where is the information required to prune each DAG and compose them?

I wish I could write this myself, but I'm afraid I don't have the engineering skill yet. I'll be happy to contribute whatever little I can, and participate in the review process.

Thanks.

-- Ram