2009/4/26 Björn Steinbrink <B.Steinbrink@xxxxxx>:
> On 2009.04.25 15:36:24 -0400, David Abrahams wrote:
>> Where it's relevant when the user notices that two distinct files have
>> the same id (because they happen to have the same contents) and wonders
>> what's up.
> ...
> And why would your implementation save the same object twice, in two
> distinct files?

This question makes me think that you don't understand the parent's
point. He's not talking about implementation details; in fact, there's
no reason to mix the git world and the file system world at all in this
discussion. David is pointing out that a user might notice that two
different trees list the same blob. This can be startling if you have
an incomplete picture of what's going on.

From a practical point of view, you might argue that not too many
people are looking at trees and blobs. However, it seems to me that
most people are afraid to use any of git's most useful features
precisely because they don't understand the git model. They don't
understand that nothing is ever lost unless you explicitly clean up
unreferenced objects, so they don't see how easy it is to manipulate
their repos.

I argue that if they are given full knowledge of git's concepts, then
they will be able to reason about their repo actions with confidence,
even if they only work with commits.

I think the key is to stress in the documentation the idea that there
are two separate worlds (the git object world and the working
directory's file system world) and that the git tools provide an
interface between them. This seems like a small and unnecessarily
academic point, but I believe that it's important for working with
confidence.

> ...
> You can't have two objects with the same contents to begin with, same
> content => same object.
> You can just have that one object stored
> multiple times in different places (for sane implementations this likely
> means that you have more than one repo to look at, and each has its own
> copy of that object, but that's nothing you as an user should have to
> care about).

Indeed, it's nothing you should care about. It's an implementation
detail again. Theoretically, every repo is in the same git world where
all git objects are stored; in a sense, a particular repo state is
itself an object of this world.

> It's an identity relation: same name/id => same object. Unlike e.g. a
> hash-table where you are expected to deal with collisions, and having
> the same hash doesn't mean that you have identical data.

However, having the same *cryptographic* hash does mean, for all
practical purposes, that you have identical data.

The overall point is this: the documentation should force people to
learn the right ideas, so that they can have the confidence to think
beyond blog-post templates for using git.
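
For anyone who wants to check the "same content => same object" point by
hand, here is a minimal Python sketch. It assumes a repository using the
traditional SHA-1 object format, where a blob's id is the SHA-1 of the
header "blob <size>\0" followed by the raw contents. The function name
git_blob_id and the file paths are made up for illustration; the path is
not part of the blob at all (names live in tree objects), which is why two
differently named files can list the same id.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id git would assign to a blob with this content (SHA-1 format)."""
    # A blob is hashed as "blob <size in bytes>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical working directory: two distinct paths, identical contents.
worktree = {
    "docs/LICENSE": b"hello\n",
    "src/LICENSE": b"hello\n",
    "README": b"something else\n",
}

ids = {path: git_blob_id(data) for path, data in worktree.items()}
for path, oid in sorted(ids.items()):
    print(oid, path)

# Same content => same id, regardless of the path.
assert ids["docs/LICENSE"] == ids["src/LICENSE"]
# Different content => different id.
assert ids["docs/LICENSE"] != ids["README"]
```

The printed ids can be compared against `git hash-object <file>` run on
real files with the same contents.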