Re: [doc] User Manual Suggestion

Michael Witten <mfwitten@xxxxxxxxx> · Sun, 26 Apr 2009 11:36:04 -0500

2009/4/26 Björn Steinbrink <B.Steinbrink@xxxxxx>:
> On 2009.04.25 15:36:24 -0400, David Abrahams wrote:
>> Where it's relevant when the user notices that two distinct files have
>> the same id (because they happen to have the same contents) and wonders
>> what's up.
>...
> And why would your implementation save the same object twice, in two
> distinct files?

This question makes me think that you don't understand the parent's
point. He's not talking about implementation details; in fact, there's
no reason to mix the git world and the file system world at all in
this discussion.

David is pointing out that a user might notice that two different
trees list the same blob. This can be startling if you have an incomplete
picture of what's going on.
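
As a rough illustration of why that happens, here is a minimal Python
sketch of how a blob id is derived. It assumes the classic SHA-1 object
format (a "blob <size>\0" header followed by the contents); the helper
name blob_id is made up for this example and is not part of git.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Return the id of a blob with these contents (assumes SHA-1 object format)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# Two entries with different names (possibly in different trees)...
readme = b"hello\n"
copy_of_readme = b"hello\n"

# ...get the same id, because the id depends only on the contents,
# never on the file name or location.
same_id = blob_id(readme) == blob_id(copy_of_readme)
print(same_id)
```

The file name never enters the hash; names live in trees, which is why
two trees can point at one and the same blob.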

From a practical point of view, you might argue that not too many
people are looking at trees and blobs; however, it seems to me that
most people are afraid to use any of git's most useful features
precisely because they don't understand the git model and they don't
understand that nothing is ever lost unless you explicitly clean up
unreferenced objects---they don't see how easy it is to manipulate their
repos. I argue that if they are given the full knowledge of git's
concepts, then they will be able to reason about their repo actions
with confidence, even if they only work with commits.

I think the key is to stress in the documentation the idea that there
are 2 separate worlds (the git object world and the working
directory's file system world) and that the git tools provide an
interface between them; this may seem like a small and unnecessarily
academic point, but I believe that it's important to working with
confidence.
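
A toy Python sketch of the two-worlds idea, under loud assumptions: the
objects dictionary stands in for the object world, a temporary directory
stands in for the working directory, and the functions add and checkout
are hypothetical helpers named after the git commands, not how git
itself is implemented.

```python
import hashlib
import tempfile
from pathlib import Path

objects = {}  # stand-in for the object world: id -> contents

def add(path: Path) -> str:
    """Copy a file's contents from the file system world into the object world."""
    data = path.read_bytes()
    oid = hashlib.sha1(data).hexdigest()  # simplified id, not git's exact format
    objects[oid] = data
    return oid

def checkout(oid: str, path: Path) -> None:
    """Write an object's contents back out into the file system world."""
    path.write_bytes(objects[oid])

with tempfile.TemporaryDirectory() as workdir:
    notes = Path(workdir) / "notes.txt"
    notes.write_bytes(b"first draft\n")
    oid = add(notes)
    notes.unlink()        # the file is gone from the working directory...
    checkout(oid, notes)  # ...but the object is still there to restore it
    restored = notes.read_bytes()

print(restored == b"first draft\n")
```

Deleting the file touches only the file system world; the object stays
put until something explicitly removes it.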

> ...
> You can't have two objects with the same contents to begin with, same
> content => same object.  You can just have that one object stored
> multiple times in different places (for sane implementations this likely
> means that you have more than one repo to look at, and each has its own
> copy of that object, but that's nothing you as a user should have to
> care about).

Indeed it's nothing you should care about. It's an implementation
detail again; theoretically, every repo is in the same git world where
all git objects are stored---in a sense, a particular repo state is
itself an object of this world.

> It's an identity relation: same name/id => same object. Unlike e.g. a
> hash-table where you are expected to deal with collisions, and having
> the same hash doesn't mean that you have identical data.

However, having the same *cryptographic* hash does mean, for all
practical purposes, that you have identical data.
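
To contrast this with an ordinary hash table, here is a small sketch of
a content-addressed store in Python. The class name ObjectStore and its
methods are invented for illustration, and it hashes the raw contents
with SHA-1 rather than reproducing git's exact object format.

```python
import hashlib

class ObjectStore:
    """Toy content-addressed store: the key is derived from the value itself."""

    def __init__(self):
        self._objects = {}

    def put(self, content: bytes) -> str:
        oid = hashlib.sha1(content).hexdigest()
        # Storing the same content again changes nothing:
        # same content => same id => same object.
        self._objects.setdefault(oid, content)
        return oid

    def get(self, oid: str) -> bytes:
        return self._objects[oid]

    def __len__(self) -> int:
        return len(self._objects)

store = ObjectStore()
a = store.put(b"same contents\n")
b = store.put(b"same contents\n")
c = store.put(b"other contents\n")

print(a == b, a != c, len(store))  # prints: True True 2
```

Unlike a hash table, the store has no collision handling at all; it
simply relies on the assumption that distinct contents will not share
an id.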

The overall point is this: The documentation should force people to
learn the right ideas, so that they can have confidence to think
beyond blog-post templates for using git.