[This message, (yes, another long one from me), proposes 3 changes. The
first should be uncontroversial I think, while the second and third are
clear heresy, (and the second would require some amount of re-training
or re-configuration by existing git users). Pick and choose as you see
fit. I don't think they actually depend on each other, though I'll
present them here as parts of a whole.]

Change #1: Add "git stage" command, use "--staged" instead of "--index"
=======================================================================
If we're going to start describing the index as a "staging area" let's
make the command set reflect that as well. I propose a new "git stage"
command that is intended for human use when wanting to do a staged
commit. Then, a few other commands that currently have --index or
--cached arguments could switch to --staged as well.

With this change here is a summary of some of the primary git commands
(that are relevant to the current discussion):

  add              Shove a file's contents into git's staging area
  stage            Shove a file's contents into git's staging area
  rm               Remove a file from git's staging area
  diff             Show what's changed in working tree compared to
                   staging area
  diff --staged    Show what's changed in staging area compared to
                   latest commit
  commit           Create a new commit from the contents of the
                   staging area
  commit -a        Update the contents of all files in the staging
                   area, and create a new commit from the new
                   staging area
  commit files...  Create a new commit that differs from the latest
                   commit only in files... (which get new content
                   from the current working tree). Staged content
                   of other files (if any) will not be committed.

I hope that so far (in this email) I haven't said anything very
contentious. This is basically just a summary of the existing behavior
with things like "update-index" and "--cached" changed to "stage" and
"--staged".

The introduction of this new "stage" command would be a very minor change.
If you're not particularly picky about names, it might be seen as
having no impact at all, (or even slightly negative since "add" and
"stage" could be considered equivalent). If you are picky about names
you might consider it slightly better to "add" when adding a new file
and to "stage" when you want to put some content into the staging area.

OK, so now let me start in with my heresy[*]. To start with I'd like to
group the above commands into two groups such that one can be
understood without a need to understand the purpose of the staging
area. Note: the goal here is not to lie about the staging area. It will
still be mentioned in the documentation for any command that needs to
mention it, but in a way that a user can easily ignore those portions
at first. So the grouping is:

Without staging
---------------
  add
  rm
  diff
  commit -a
  commit files...

With staging
------------
  stage
  diff --staged
  commit

So far, that's just a re-grouping. No names or semantics have been
changed.

Change #2: Make a staged commit an explicit act
===============================================
The "-a" stands out to me here as the only command-line option needed
in the first list, and the plain "commit" stands out as the only
command in the second list that performs a staged operation by default.

So change number two is to redefine "commit" to mean what "commit -a"
meant before and to require a new command-line option for staged
committing, (the best naming I have so far is "commit --staged" with a
shortcut of "commit -i"; the mismatch of "'i' as short for --staged" is
a bit unlovely I admit).

Here's what we have after change #2:

Without staging
---------------
  add
  rm
  diff
  commit
  commit files...

With staging
------------
  stage
  diff --staged
  commit --staged (or "commit -i")

Change #3: Change "add" to not stage any content
================================================
To finish off, I'd like to propose descriptions of the commands to
allow the user to use the "without staging" commands as a complete set
while being able to easily ignore any of the staging capabilities. This
does trigger a need for a semantic change in the "add" command. Here
are the proposed descriptions:

Without staging
---------------
  add              Add a file to be managed by git
  rm               Remove a file to no longer be managed by git
  diff             Show the changes in the working tree compared to
                   the latest commit, (or compared to staged
                   content, if any)
  commit           Commit the current state of all git-managed files
  commit files...  Commit the current state of the specified files

With staging
------------
  stage            Shove the current contents of the specified files
                   into git's staging area
  diff --staged    Show the changes in the staging area compared to
                   the latest commit
  commit --staged  Commit the state of the current staging area
  commit -i

To make the above work, I think Daniel's suggestion of making "add" put
0{40} into the staging area should work just fine.

I know that Linus has religious objections to these proposed new
semantics of "git add". One response there is to just consider "add" to
be a mud-pit command for people to wallow in that really want it, (like
Linus' proposed "ci" command). If you don't want to be in that mud-pit,
then just use my "stage" command along with "commit -i", (or with
"commit" and some configuration option, or with "commit" and a
rejection to my change #2).

Another response is that these new semantics for "add" really aren't
any worse than other existing things in git, (for example, "git rm"
isn't just updating file content into the index, because it even leaves
the file around by default).

[Actually, the fact that "git rm" doesn't delete the file by default is
a bug (and it's my bug).
I think the right thing is that "git rm" should be defined as always
deleting the file from the working tree, and that it should be fixed to
fail if the file is dirty, (unless -f is passed)].

Other examples of the current semantics of git commands being just as
"evil", (I would argue "usable" instead), are below.

I think that here, finally, I've made my proposal as clearly and
consistently as I can. I think the above would only improve git, (by
making it easier to use by new people, while still providing a
consistent model and a way to easily learn everything git has to
offer).

Change #2 would be the hardest pill to swallow since it would mean some
change in the habits of existing users, (the other changes could
largely be blissfully ignored by trained git users I think). This
difficulty could be softened with a configuration option something like
core.commitStagedByDefault, or this one change could be rejected.

-Carl

[*] I say heresy, but I think all the talk about "inconsistency" and
"dishonesty" in the proposals I've been making is really misplaced. The
easiest way to see that is to apply the same arguments to existing
commands in git and see that they are already inconsistent and
dishonest.

Inconsistency
-------------
If the consistent model is "'commit' commits the contents of the
staging area" then what in the world is happening in the case of
"commit files..."? There's really no way to describe that operation in
terms of the staging area, because it simply ignores it. The closest
you could get is to describe the internal implementation in detail:

  commit files...  Creates a temporary staging area from the latest
                   commit, shoves the content of the named files
                   into that temporary staging area, creates a new
                   commit from that and then does [something] to
                   the original staging area.

I (obviously) botched that. Somebody could write an actual, correct
technical description. But you know what? It would be totally useless.
It's really hard to describe what the current command does in terms of
the staging area and nobody would care anyway. It wouldn't help anybody
use the thing.

The fact that all commit operations _do_ involve a staging area at some
deep point in the implementation is totally irrelevant to the fact that
what "commit files..." does do _is_ desirable, and is not hard to
explain at a conceptual level. What the current documentation has is:

  "Commit only the files specified on the command line."

This documentation doesn't say _anything_ about the content coming from
the working tree rather than the index. But that's _obviously_ the
correct place for the content to come from, and that's what's
implemented.

Dishonesty
----------
The argument here is that some "easier to use" commands lie to the
user, giving them an incorrect idea of what's really happening, and
that this will create barriers to later understanding.

I think the same argument could be applied to say that there's no
reason to have "add", "rm", "resolve", and "update-index" (or "stage").
These commands are all doing the same thing at a technical level, so
why lie to the user and let the user think they are doing something
different?

My reply is that this isn't a lie, but it's providing names for the
user that match the operations that the user is conceptually doing.
That's called "providing a usable interface". If the user goes on to
learn the internals and discovers that these are all wrappers around
some shared core command, then the user can appreciate that elegance of
implementation. But forcing everyone to _use_ one command for these
conceptually separate operations would be a mistake from the
point-of-view of usability.
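
For anyone who wants to check the "commit files..." behavior described
under "Inconsistency" above, here is a rough shell sketch. It assumes a
reasonably recent git on the PATH, and it uses only existing commands
("add", "commit files...", "diff --cached"), none of the proposed
"stage" or "--staged" spellings. The file names, commit messages, and
the demo identity are made up for illustration.

```shell
#!/bin/sh
# Sketch: "git commit <file>" takes its content from the working tree,
# and staged content of other files is left out of the commit.
set -e

repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"

# Initial commit with two files.
echo one > a.txt
echo one > b.txt
git add a.txt b.txt
git commit -q -m "initial"

echo two > a.txt
git add a.txt          # a.txt: new content is staged
echo two > b.txt       # b.txt: modified in the working tree only

# Commit only b.txt; the staged change to a.txt should be ignored.
git commit -q -m "only b" b.txt

committed=$(git diff --name-only HEAD~1 HEAD)    # files in the new commit
still_staged=$(git diff --cached --name-only)    # staged but uncommitted
echo "committed: $committed"
echo "still staged: $still_staged"
```

If the description above is right, the last two lines should report
b.txt as committed and a.txt as still staged.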