On Wed, 06 Dec 2006 10:31:21 -0800, Junio C Hamano wrote:
> I am not sure what needs to be commented on at this point, since
> it is not yet clear to me where you want your proposal to lead
> us.

Thanks for the comments you made here---that's the kind of thing I was
looking for.

As for where I'm trying to lead us, what I really want to do is to help
improve the learnability of git. A big part of that is about improving
the set of "use-oriented" documentation, (which describes how to achieve
tasks, as opposed to what might be termed "technically oriented"
documentation which describes how individual tools work). I think too
much of the existing documentation falls into the second class.

A parallel thread is already talking about some of the important
organizational aspects of use-oriented documentation. And I agree with
that thread that the short "attention span" is a primary consideration
for this kind of documentation. The user has a task to accomplish, and
any text or concepts that aren't contributing to the solution of that
task should be eliminated.

Note that when I talk about eliminating unnecessary concepts, I do not
mean lying to the user about the underlying model or any concepts. We
can't have a sugar-coated tutorial that says one thing, and then expect
users to "unlearn" that if they go deeper into the reference manual.
That's a recipe for disaster.

Also, when I say "use-oriented" I'm not suggesting that the
documentation be shallow. It can go as deep as any workflow we care to
document and introduce whatever concepts of git are necessary to support
that workflow. (There is, though, a level at which "technically
oriented" documentation is all that's needed, or even desired, and
that's when the documentation is targeting authors of interfaces that
build on top of git---not users trying to use git to get work done at
the command line).

OK, so if my concern is all about documentation, then what am I doing
proposing new commands or new ways of thinking about existing commands
rather than just sending documentation patches?

The problem is that the current semantics of the following variations of
"git commit":

    git commit
    git commit -a
    git commit paths...

defeat the goal of writing good, clean use-oriented documentation. So
there's some adjustment that should be made first. And I don't even care
what the adjustment is, (for example, it doesn't have to be "commit -a
by default"), but please recognize the problem and help me come up with
an acceptable way to fix it.

To demonstrate, let's take the simplest of use cases and try to document
it in as clear a way as possible. Let's imagine we're in a tutorial
where we've just guided the user to making modifications to several
existing, tracked files, (starting from an initial clone, not an
init-db), and the next task is to teach the user to commit for the first
time. We would like to document both "commit a single modified file" and
"commit all modified files". Here are two approaches that I can come up
with:

 1. Any commit involves first "add"ing together new content, and then
    committing the result. For example, to commit a single file:

        git add file    # add new content from file
        git commit      # commit the result

    As a shortcut, "commit -a", (or --all) can be used to automatically
    "add" the content of all tracked files before the commit. So the
    common case of committing all tracked files is as easy as:

        git commit -a   # commit content of all tracked files

 2. The new content of modified files can be committed by naming the
    files on the "git commit" command line. For example:

        git commit file # commit new content of file

    As a shortcut, "commit -a", (or --all) can be used to commit the
    content of all tracked files:

        git commit -a   # commit content of all tracked files

Neither of the above is totally satisfactory.

In (1) the user is not presented with a framework that will make sense
of "git commit files...". The expansion of "-a" as "--all" could easily
give the user the impression that "git commit files..." is a shortcut
for "git add files...; git commit", but that's wrong and could lead to
unexpected results and confusion.

In (2) the user is not presented with a framework that will make sense
of "git commit" with no arguments. The user is left to wonder about why
the --all is needed and what it means exactly, (particularly since "git
commit" also commits the content of all tracked files).

Various fixes have been proposed for these potential confusions. For
example, making "git commit files..." default to the behavior of
--include instead of --only would eliminate the confusion I described
for (1). And making -a the default for "git commit" would eliminate the
confusion I described for (2).

However, actually implementing either of those fixes would then break
the initial "commit one file" example from the other approach. Because
of that, the conversation has often fallen into debate over whether (1)
or (2) is the "one true way" to describe git, and which one leads the
user to have an incorrect mental model.

But I think that debate is misguided since both descriptions are
worthwhile and valid. (1) is based around an explanation of what "git
commit" does, and (2) is based around an explanation of what "git commit
files..." does. And both of these commands are very useful exactly how
they are. It's almost coincidental that "commit -a" fits in logically
with either description.

So what I was trying to get across in this latest thread is that git's
command-line interface already has two slightly different models for
what's going on in a commit. You don't agree with me on that point yet,
(more on that below in my reply).

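The confusion described for (1) can be reproduced in a throwaway
repository. What follows is only a sketch, assuming a git in which paths
named on the "git commit" command line get the --only behavior by
default; the file names a and b, the demo identity, and the shell
variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: "git commit <path>" is not "git add <path>; git commit".
set -e

demo=$(mktemp -d)                 # scratch repository, safe to delete
cd "$demo"
git init -q
git config user.name "Demo User"  # hypothetical identity for the demo
git config user.email "demo@example.com"

echo one > a; echo one > b
git add a b
git commit -q -m "initial"

echo two > a; echo two > b
git add a                         # stage the new content of a
git commit -q -m "b only" b       # names b: the staged change to a is left out

# Paths recorded by that commit, and paths still staged afterwards.
only_committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
only_still_staged=$(git diff --cached --name-only)

echo three > b
git add b
git commit -q -m "from the index" # no paths: commits everything staged

index_committed=$(git diff-tree --no-commit-id --name-only -r HEAD | xargs)

echo "commit b  -> $only_committed (still staged: $only_still_staged)"
echo "commit    -> $index_committed"
```

Had "git commit b" been a shortcut for "git add b; git commit", the
first commit would have picked up a as well, since a was already staged.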
I really don't care what the final fix is, but I would love to see
documentation with no more complexity than the above that accurately
captures the useful functionality. And I don't actually have a concrete
proposal for a fix yet---I was just offering the commit-index-content
and commit-working-tree-content ideas as ways to think about the issue.
Maybe the two documentation blurbs above capture it in a better way.

Do you feel like you have a better understanding of what I'm trying to
do now?

> I do not agree with your "three commands" or "two semantics"
> characterization of the current way "git commit" works. "git
> commit" without any optional argument already acts as if a
> sensible default arguments are given, that is "no funny business
> with additional paths, commit just what the user has staged
> already."

I agree that "git commit" does nothing funny by default. What I was
pointing out is that "git commit" and "git commit paths..." do not have
the same semantics. There's really nothing to debate about there. There
is no argument you can substitute for <paths...> to give you identical
behavior as "git commit". That's a fact.

> "git commit" is primarily about committing what has been staged
> in the index, and "--all" is just a type-saver short-hand (just
> like "--include" is) to perform update-index the last minute and
> nothing more. In other words, "--all" is a variant of the
> pathname-less form "git commit". It is not a variant of "git
> commit --only paths..." form, as you characterized.

I hope the documentation blurbs (1) and (2) above show how "commit -a"
can be seen as a variant of either "commit" or "commit files...", (which
themselves are both useful semantics, but demonstrably distinct).

> The pathname form (the "--only" variant) on the surface seem to
> work differently, but when you think about it, it is not all
> that different from the normal commit. We explain that it
> ignores index, but in the bigger picture, it does not really.

No, it really is different.

> the first commit does "jump" the changes already made to the
> index, but after it makes the commit, the index has the same
> contents as if you did "git update-index a b" where you ran that
> "git commit". In other words, it is just a handy short-hand to
> pretend as if you did the above sequence in this order instead:

How could you document "git commit files..." as a shorthand? A shorthand
for what exactly? A shorthand for pretending you didn't just type the
commands you did type that got the index into its current state, but had
instead typed different commands before the commit and other commands
afterwards? That's crazy. That's not a shorthand. That's just plain
different semantics.

The current "git commit files..." command never does commit the contents
of "the" index as a concept presented by "git commit". (This is
independent of the fact that the implementation of "git commit files..."
certainly does use an index file somewhere and uses it to create a
commit object in the same way that "git commit" uses "the" index).

> So I actually think it is a mistake to stress the fact that "git
> commit --only paths..." seems to act differently from the normal
> "git commit" too much.

I think that would be lying to the users and setting them up to get
confused later. I discussed this above as the confusion that can result
with the explanation of (1). If you teach "git commit" as committing
"the" index, and de-emphasize that "git commit files..." is semantically
distinct, then how is a user ever supposed to learn what it is that "git
commit files..." is actually doing?

> In short, while I understand that your "proposal" shows your own
> way to summarize the semantics of "git commit", I am not seeing
> what it buys us, and I do not see the need to come up with a
> pair of new two commands for making commits (if that is what the
> proposal is about, that is, but it is not clear to me if that is
> what you are driving at).
> I think it would only confuse users.

Forgive me again for being obtuse. I don't think we should necessarily
add two new commands. I was trying to illustrate a problem in the
existing command set, and propose a new way of thinking about the tasks
that the current commands help a user to perform, (committing content
from the working tree or committing content from the index). I don't
actually have a concrete proposal for how to take that way of thinking
and map it to a command set, (and one that would disrupt current git
users as little as possible). I'd love to have some help with that part.

> Is it just me who finds the above a very much made-up example?

Fine. We can ignore that example.

> In any case, I should clarify my aversion to partial commits a
> bit. What is more important is to notice that, while you cannot
> compile-and-run test what is in the index in isolation (without
> a fuse that exports the index contents as a virtual filesystem
> -- anybody interested?), you _can_ preview and verify the text
> that is going to be committed by comparing the index and the
> HEAD. And for that, your "staging" action (i.e. Nico's "git
> add") needs to be a separate step from your "committing" action.

Yes, I often use the index as a place to preview things. And it is true
that I find myself using update-index when I could have used "commit
paths..." precisely because I can preview it once more.

But I do use the "commit paths..." form at times as well. If I have just
reviewed things in "git diff" and there are _really_ obviously separable
pieces I will commit them alone without staging into the index and
reviewing again. It's probably the case that I skip the explicit staging
and extra preview when I can use a single pathname as the argument to
"git commit".

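The stage-then-preview habit described above might look like the
following in practice. This is a rough sketch rather than a recommended
workflow: it uses "git add" for staging and "git diff --cached" to
compare the index against HEAD, and the file names, demo identity, and
variable names are invented for the example.

```shell
#!/bin/sh
# Sketch: stage first, preview index vs. HEAD, then commit.
set -e

demo=$(mktemp -d)                 # scratch repository, safe to delete
cd "$demo"
git init -q
git config user.name "Demo User"  # hypothetical identity for the demo
git config user.email "demo@example.com"

echo one > a; echo one > b
git add a b
git commit -q -m "initial"

echo two > a; echo two > b
git add a                         # the separate "staging" step

# Index vs. HEAD: exactly what a plain "git commit" would record.
preview=$(git diff --cached --name-only)
# Working tree vs. index: what would be left out of that commit.
left_out=$(git diff --name-only)

git --no-pager diff --cached      # full text of the preview

git commit -q -m "a only, after preview"
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)

echo "previewed: $preview, committed: $committed, left out: $left_out"
```

The commit records the same paths the preview listed, which is the
property that skipping the staging step with "git commit paths..." gives
up.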
> In other words, I would even love Johannes's "per hunk commit"
> idea, at least if it had an option to preview the whole thing
> just one more time before committing, and I would love it better
> if it had an option for not committing but just updating.

Yes! I've wanted tools to help with per-hunk separation before, but
since I'm so likely to make mistakes while doing that I would only want
that to go into the index so that I could review it before committing. I
guess I might need a per-hunk way to fix up my mistakes too if I put a
hunk into the index that I didn't want to be there.

-Carl