Re: [PATCH 0/2] Making "git commit" to mean "git commit -a".

Carl Worth <cworth@xxxxxxxxxx> · Wed, 29 Nov 2006 15:37:28 -0800

On Tue, 28 Nov 2006 23:44:57 -0800, Junio C Hamano wrote:
> Nicolas Pitre <nico@xxxxxxx> writes:
>
> > This argument has its converse.  What you should _not_ have to worry
> > about all the time is whether your index really includes all the changes
> > you want included in your next commit.
>
> That's what we have "git diff" with various output options for;
> I often do "git diff --stat" or "git diff --name-status" when I
> know I am about to commit in a dirty working tree.  I suspect
> that I am not getting your point.

I think you just backed up the point you didn't understand. You said,
"when I know I am about to commit in a dirty working tree". That's an
exceptional thing to do, (from the point of view of your "pure
discipline"), so you will always be conscious of doing it when you do.

> > And whether wanting to leave local changes in the working directory
> > without commiting them actually happen more often than wanting to commit
> > every changes is arguable.
>
> I do not think anybody is talking about which happens more
> often.

That's exactly what I'm trying to talk about here.

I've been using your "commit -i and -a as default" patches since you
sent them out, (thanks again for sending them). I will readily admit
that the very first commit I wanted to make was a "partial, dirty
tree" commit.

What happened was that I had made a couple of independent changes in
concert and now I wanted to commit them separately. I think that in my
usage, this is the most common case for me using update-index and
commit rather than "commit -a". It also happened that the independent
changes modified disjoint sets of files.

Now, the index works perfectly for this kind of situation, and I like
it. Some of the people that I know, (people who are currently refusing
to touch git, and people for whom I've been trying to be a proponent
in this discussion), would just use "git commit file1 file2 ...." in
this situation. I don't do that since I really like being able to
identify the files in one command, (update-index), and then preview
the commit I'm going to make before I do so, (with "diff
--cached). Sure, I could just do the commit, do an after-the-fact
review with "show" and then reset if I screwed up, but that just feels
like the wrong way to go about it.

So, I'm an index lover here. I see how it's useful and I use it on a
regular basis. But I do believe I make "clean index" commits more
frequently. And regardless, I still think we should change the default
for git-commit to make it easier for users to learn git.

[As an aside, the situation of independent changes being mixed in a
working tree is not always so lucky as to be cleanly separated into
disjoint file sets. When it's not, I have to disentangle them. Now,
the index could really help during this operation too, but we would
need better tools than update-index which only works on a per-file
basis. Something that let me easily select chunks of the patch would
be really nice.]

> "screw the index" people do not have to worry about the
> index during the course of their changes in the working tree
> toward the next commit, and the only time they need to tell git
> (which _IS_ a system based on the index, dammit)

I don't think "screw the index" is an accurate way to characterize my
position at least. [In fact is "_the_ index" even a good way to
describe what git has? Doesn't a commit operation that lists explicit
files build up an alternate index which it will throw away if there's
a failure at some point?]

So some git operations work by creating a tree object by first putting
some state into an index file. That's fine. And users can even take
advantage of doing that themselves if they want to. That's fine too.

All I'd like is that new users didn't have to learn those concepts as
early as they have to do with current git.

Now, one time git really does have "_the_ index" and when the user
really needs to know about it is when there is a conflicted merge. As
Linus has been pointing out recently, there are some important
benefits that the index provides at this point. And I think this is
the right time to teach users about the index. I think they can learn
it and take advantage of it at that point, (and I wouldn't even expect
them to necessarily want to change the configuration of what "git
commit" means).

In general, the process of resolving a conflicted merge is poorly
presented to the user by git. The commands that leave the conflicted
index, (git pull, git rebase, etc.) and that examine it, (git status),
don't do a good job of telling the user what to do (git update-index)
and how to examine the situation (git diff --merge). I think this
could be improved a lot, and one piece of that is the "git resolve"
thing I've been proposing recently.

> It might make sense to have a configuration in .git/config that
> says "user.workingtreeistheking = true".  This should obviously
> affect what "git commit" does by default,

I'd love to see a configuration value added, but I don't think we gain
anything at all if we add a configuration value and leave the default
value the same. It's not easier to tell new users to configure git to
work this way then it is to tell them to use "commit -a".

>                                            but it also should
> change the behaviour of other commands to suit the "screw the
> index" workflow better.
>
> For example, the configuration should probably make "git diff"
> (without an explicit --cached nor HEAD) pretend it was asked to
> show diff between HEAD and the working tree, because the user
> chose not to care about the index.

Actually, I strongly disagree on this point. Months ago, before I
understood the index as well as I do now, I did argue for a change
like that, since I thought the index was just confusing. But I think
it would be a mistake to make two fundamentally different models creep
through all the tools. (Which is to say, I'm agreeing with what I
think your motivation is for pushing back against that kind of thing.)

Plus, this model would be just broken anyway. The problem is that "git
diff" meaning "git diff HEAD" works just fine when you're in a
"normal" situation. But as soon as you're looking at a conflicted
merge, then "git diff HEAD" isn't useful at all, but the difference
between the index and the working tree is very useful. In that
situation, the behavior of "git diff" suddenly makes a lot of sense,
and the index has its chance to shine. This can lead to a nice
epiphany for users I think.

                                    Not caring about the index
> is different from consciously keeping the index clean;

Yes, those are different. And I agree with you that a "pretend the
index doesn't exist" mode would not be an improvement.

What I would like to see instead is that all commands keep the index
clean by default, (with the notable exception of failed merge). And
the only way to get it dirty would be with an explicit option such as
"apply --index".

See? If users only get a dirty index by asking for it, then the
likelihood of confusion goes way down, as users who haven't heard of
"the index" certainly won't have any reason to pass a "--index" option
around.

> Would that make people happy?  I do not think so.  I think it
> will lead to more confusion to have two majorly different
> semantics in the same set of tools.

Agreed. Let's not go there.

I think what I'm asking for is a much more mild change. The "keep the
index clean" behavior exists almost everywhere already, (the few
exceptions are things like "git cherry-pick -n", and the notable
exception of a conflicted merge). So I don't think supporting "commit
-a by default" means we have to introduce a large conceptual change.

Also, for the case of the conflicted merge, the index really is a key
part of what lets the user work through things. But there's really not
an _essential_ need for the user to fully grok the index to take
advantage of that. It's not hard to describe the behavior of "git
diff" and "git diff --merge" in terms of things that the user wants to
know, without mentioning the index at all. (Now, if a user asks, "how
is git able to tell me all this amazing stuff", then would be a great
time to explain the index, I think).

Did I succeed in getting you to consider anything new here? I hope
so. I don't want you to feel like we're both just saying the same
things back and forth over and over with no progress.

-Carl
Attachment:
pgpXUGQoxqepE.pgp

Description: PGP signature