Re: [PATCH] make 'git add' a first class user friendly interface to the index

Carl Worth <cworth@xxxxxxxxxx> · Fri, 01 Dec 2006 22:54:04 -0800

On Fri, 01 Dec 2006 14:31:45 -0800, Junio C Hamano wrote:
>        "registering thing in the index".  We on the list are
>        just about to agree to give a good short name, "to
>        stage", for that action they have known about, in order
>        for us to make it easier to explain to new people.  That
>        should not affect the terminology the old timers are
>        accustomed to and trained their fingers with
>        ("update-index", "diff --cached", "apply --index").

You can adopt a new, short name, and use it in both documentation
_and_ commands without breaking any habits. Just leave the
implementation of the old commands alone. You can even remove things
from the documentation, (or squirrel chunks away to "deprecated"
sections), if you're leaving things only for old-timers.

I've been _trying_ to make git easier to learn, and when there are two
commands, (update-index and "diff --cached"), that use different
terminology for the same concept, that's a road bump to learning.

Yes, _we_ all know that it's talking about the same thing. And we are
all an existence proof that people _can_ learn git as it is without
any changes. But I contend that more people could learn git more
easily if we worked to smooth things out like this.

But almost none of what I proposed should really make things harder on
experienced users. If we make the terminology of the command-set
consistent with the way we explain things in the tutorials and
documentation, then we're being that much nicer. I came up with
"stage" and "--staged" because over and over again I saw Linus and
others say things like "the index is easy to understand if you think of
it as a staging area."

Someone didn't like the use of "stage" as a verb. I'd be happy with
something else that's a nice, short verb that has a consistent
adjective to match. Currently, we have an unshort, non-verb
"update-index", a mismatched adjective "--cached" and a misplaced noun
"--index".

The proposal in the current thread of using "add" is an improvement on
the shortness side, and I am _delighted_ to see documentation
appearing that is focused on what the user wants to achieve and what
the user should expect to happen. So, Junio, please go ahead with
Nico's stuff here. It is an improvement over the current
situation. (And thanks, Nico, for fighting against having technical
details getting added to user-oriented documentation).

But I do still think it's a mistake to muddle the concepts of "adding"
a file and "staging edited content" for a file. In index terms, the
distinction is between adding a new path (and contents, of course) to
the index vs. just updating the contents for an existing path.

But it's not the index distinction that's interesting. It's that users
think of those operations differently. An "add" operation takes a
file out of the "untracked file" state as reported by git
status. That's a very different thing conceptually than updating the
contents of a file that is already being tracked by git. And if the
user thinks of an operation as being different, the command should
reflect that. There is a sense in which the user is always right here,
(since if the tool doesn't do what the user wants, the user just goes
somewhere else).
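
To make the two cases concrete, here is a minimal sketch of the states
a file passes through as "git status" reports them. The file name
hello.txt, the throwaway repository, and the demo identity are all
made up for illustration; the status codes are what a current git
prints with --porcelain.

```shell
#!/bin/sh
# Sketch: "adding" a new path vs. "staging" new content for a tracked path.
set -e
repo=$(mktemp -d)            # throwaway repository, hypothetical location
cd "$repo"
git init -q
git config user.email "demo@example.com"   # demo identity, assumption
git config user.name "Demo User"
git config commit.gpgsign false

echo "first draft" > hello.txt
state_untracked=$(git status --porcelain)  # "?? hello.txt": untracked

git add hello.txt                          # the "add a new file" case
state_added=$(git status --porcelain)      # "A  hello.txt": new path in the index
git commit -q -m "add hello.txt"

echo "second draft" >> hello.txt
state_edited=$(git status --porcelain)     # " M hello.txt": tracked, edit not staged

git add hello.txt                          # the "stage edited content" case
state_staged=$(git status --porcelain)     # "M  hello.txt": edit staged

printf '%s\n' "$state_untracked" "$state_added" "$state_edited" "$state_staged"
```

The same command is typed twice, but the first use moves the file out
of "??" and the second only refreshes content for a path git already
knows about.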

>        The option to 'git diff --cached' may need a new synonym
>        to make things consistent, but the new synonym should be
>        --index, not --staged.

I like consistency, so I agree that "diff --index" is an improvement
over "diff --cached", (and of course you can leave "diff --cached"
around forever).
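
For anyone following along who hasn't used it, a small sketch of what
"diff --cached" reports, as opposed to plain "diff". The file
notes.txt and the repository setup are hypothetical; the only option
used is the existing --cached.

```shell
#!/bin/sh
# Sketch: "git diff --cached" shows what is staged; "git diff" shows what is not.
set -e
repo=$(mktemp -d)            # throwaway repository, hypothetical location
cd "$repo"
git init -q
git config user.email "demo@example.com"   # demo identity, assumption
git config user.name "Demo User"
git config commit.gpgsign false

echo "one" > notes.txt
git add notes.txt
git commit -q -m "initial notes"

echo "two" >> notes.txt
git add notes.txt            # "two" is now staged in the index
echo "three" >> notes.txt    # "three" exists only in the working tree

staged=$(git diff --cached)  # index vs. last commit: contains "+two"
unstaged=$(git diff)         # working tree vs. index: contains "+three"

echo "--- staged ---";   echo "$staged"
echo "--- unstaged ---"; echo "$unstaged"
```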

The "--staged" thing only came up as I looked for a replacement for
"update-index" as a verb. We can just use "add" too, but it is a bit
awkward for the reasons I explained above.

> > maybe add a -f/--force argument to allow for adding ignored files
> > instead of going through git-update-index.
>
> Yup.

Yes, very nice.

> > maybe add --new-only and --known-only arguments if there is a real need
> > to discriminate between new vs updated files.  I would not suggest
> > against it though, because if someone really has such fancy and uncommon
> > requirements he might just use git-update-index directly at that point.

Please don't add --new-only and --known-only options to git add. The
fewer the options, the easier the command is for humans to learn.

The only place I can imagine --new-only or --known-only being useful
would be in a scripting situation, not manually typed on the command
line. And as you say, update-index already exists for that.

Please keep user-oriented commands focused on things that _users_
actually want to do.

> > +Contrary to other SCMs, with GIT you have to explicitly "add" all the
> > +changed file content you want to commit together to form a changeset
> > +with the 'add' command before using the 'commit' command.

I think we can explain the git model in positive terms that stand on
their own. People will learn the differences and appreciate how git is
better. So I'd just drop "Contrary to other SCMs". It's a really weak
form of pride to compare ourselves to other systems. We can do much
better by having the hubris to pretend no other systems exist.

> > +This is not only for adding new files.

I think this sentence shows the failing of the "add" naming. We're
having to explicitly say here, "Oh, and when we say 'add' we don't
mean what you think of as 'add', we mean something else." If we mean
something else, why don't we just call it something else?

> I think there is another twist more deserving of mention than -i twist.
> If you jump the index using --only, what is committed with that
> commit becomes part of what is staged for the commit after that,
> and in order to prevent data loss, we disallow this sequence:
[...]
> So if we allowed the above sequence to succeed, we would commit
> the result of the second edit, and after the commit, the index
> would have the result of the second edit.  We would lose the
> state the user wanted to keep in the index while this commit
> jumped the index, and that is why we disallow it.

Wow, this index stuff sure takes a lot of explaining. Why are users
better off having to grasp all of that stuff before they can
successfully add; edit; #oops, add again; and commit their files?

> I wonder if this sequence should do the same as "git rm -f foo":
>
> 	$ /bin/rm foo
>         $ git add foo

Argh. Please no. Update-index already exists. Let's not push all of
its semantics onto "add". Let's use "add" for when the user _actually_
wants to _add_ a file. Please? please?

> That's one of the reasons I suggested 'checkin' instead of
> 'resolve', 'resolved', etc.  You check-in the removal of the
> content from that path to the staging area, to go as a part of
> the next commit.

I think having "checkin" as a non-synonym for "commit" would be a big
mistake for new users. Different systems out there use those terms
interchangeably. Since git has something unique in its "index" or
"staging area" we're much better off sticking to unique terms for
describing it.

> "use git-add to mark for commit, or use commit -a"?
>
> I think the one source of confusion is "update-index" sounds as
> if it is a command to "update the index" and as if you can leave
> out "with what?" part to complete the order to the command.

Yes. Jesse Keating, for example, read the "use update-index"
suggestion from git-commit and was very confused why he didn't succeed
when he thought he was following instructions with:

	git update-index
	git commit

Maybe the above could be:

	Use "git add <files...>" then "git commit",
	or "git commit -a" to add and commit all tracked files.

But why are we even directing to "git add" here instead of just:

	What would you like to commit?

	Use "git commit <files...>" to commit some files
	or "git commit -a" to commit all files.

This doesn't teach the two-part, staged commit to the user at this
point, but that's perhaps OK.

Except it does still leave open the user confusion of:

	git add file1
	git commit
	"cool, that works"

	edit file1
	git add file2
	git commit
	"hmm, why didn't file1 get committed that time?!"

And the only answer we can give to the poor user is:

	Oh, "git add", (and "git commit" for that matter) don't do
	what you think they do. Go read the documentation and try
	again.
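
For the record, the sequence above can be reproduced with a small
script. This is only a sketch: the file contents, the throwaway
repository, and the demo identity are invented for illustration, and
"edit file1" is stood in for by an echo.

```shell
#!/bin/sh
# Sketch: an edit to a tracked file is left out of the commit unless re-added.
set -e
repo=$(mktemp -d)            # throwaway repository, hypothetical location
cd "$repo"
git init -q
git config user.email "demo@example.com"   # demo identity, assumption
git config user.name "Demo User"
git config commit.gpgsign false

echo "v1" > file1
git add file1
git commit -q -m "first commit"    # "cool, that works"

echo "v2" > file1                  # edit file1, but do not add it again
echo "new" > file2
git add file2
git commit -q -m "second commit"   # commits file2 only

committed=$(git show HEAD:file1)   # content of file1 in the latest commit
leftover=$(git status --porcelain) # what is still uncommitted

echo "file1 in HEAD: $committed"   # prints "file1 in HEAD: v1"
echo "status: $leftover"           # prints "status:  M file1"
```

The second commit records file2 and the old content of file1. The
edit stays behind in the working tree until the user runs "git add
file1" again or uses "git commit -a".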

At least, with this latest round of updates, the "git add"
documentation will actually explain this stuff, (and not just say
"this is a wrapper for update-index"). But there are still a lot of
users that will say "I have to add the file over and over again?
That's bizarre." They won't be saying, "Oh, the git designers were so
brilliant to implement a system based entirely on file contents and
never treating filenames as an interesting entity separate from
content. Thank you so much!"

So, git still isn't "usable" by users who just pick up the commands and
run with them, learning more as they go along. Some potential
users get lost here. And that's too bad, because nothing in git's
model, (or even in functionality already existing in the command set),
is missing compared to what the user wants. They just didn't find it
by default.

Git _will_ be more learnable from the documentation, but it will still
leave a lot of users thinking it makes some simple things harder than
they should be. So other potential users get lost here. And that's too
bad too, because if they would stick with it a little, they could
learn things later on where git would make complex things simple,
(like conflict resolution).

If add really were uniquely about _adding_ files to be tracked,
(rather than just a short synonym for update-index), and if we tweaked
the default behavior of git-commit, we could fix these things. And
all the model and power of git would still exist and be ready to be
learned by anyone that wants it, (rather than only by those who manage
to get past snags like these).

-Carl

PS. Is there a twelve-steps program for people who can't let a thread
die? I really want to stop, and I keep telling myself I can stop
anytime I want.