Re: Command-line interface thoughts

Michael Nahas <mike.nahas@xxxxxxxxx> · Sat, 4 Jun 2011 21:00:33 -0400

Thanks for your reply, Jakub.

On Sat, Jun 4, 2011 at 5:49 PM, Jakub Narebski <jnareb@xxxxxxxxx> wrote:
> Michael Nahas <mike.nahas@xxxxxxxxx> writes:
>
>> Quick list of recommendations:
>>
>> 1. Pick aliases for the next commit and the working tree that act like
>>    commits on the command-line.
>
> No go.  This was asked for many times, and each time shot down.
> Those "aliases" / pseudo-refs look like commits but do not behave
> exactly like commits.  This would increase confusion.

I'm glad it was discussed.  I think users would know that those
commits were special (they are writeable after all), but I'm sure more
informed people than I made the same arguments.

> See also gitcli(7) manpage for description of --index and --cached
> options (and other git command line conventions).

Thanks for the pointer.  I've now read it.

>> 2. Adopt a (semi-)formal notation for describing what commands do.
>
> Whom it would help?  Not an ordinary user.

I think it would help experts in discussing exactly what happens.  For
ordinary users that are hitting an intricate case (or don't know
English very well), it would be good if there was something that would
tell them mathematically what occurs.

>> 3. Move the operations "checkout -- <file>" and "reset -- <file>" to
>>    their own command names
>
> Proposed "git unadd <pathspec>..." doesn't cover all features of
> "git reset <rev> -- <path>" nor "git checkout [<rev>] -- <path>".

I'm confused.  How can it not cover all the features?  I'm just
suggesting renaming the command.  From "git reset -- <path>" to "git
unadd [--] <path>".  (And renaming "git checkout -- <path>" to some
yet-to-be-named other command.)

>> 4. Deemphasize the "branch" command for creating branches.
>
> Or add "git branch --checkout <newbranch>".

Would that operation be different from the existing "git checkout
-b <new branch>", or just another way to write it?

>> A "normal" (long) email follows.  At the end are examples of commands
>> in a not-quite-so-formal notation.
>
>> ------------
>>
>> I was the primary designer of the PAR2 open file format and write a
>> lot of big software (application-layer multicast, etc.).  I've been
>> using Git for 2 months.  I love it and I greatly admire the plumbing.
>> However, the default "porcelain" has at times been more confusing than
>> enlightening.
>
> BTW. have you read gitcli(7) manpage?

I have now.  I'd swear I've read close to 30 manpages but never
heard-of/noticed that one until you mentioned it.  Thanks for the
pointer; it's good to have --index and --cached clarified.

>> I had some ideas about the porcelain and decided they were worth
>> sending to the mailing list.  I ran the ideas by the two Git gurus who
>> answer my questions and they agreed with them.  I wish I had the time
>> to implement them but I did PAR2 when I had time off and I'm working
>> now.  I apologize if any of these are repeats or have already been
>> discussed to death.
>>
>>
>> My recommendations are:
>>
>> 1. Pick aliases for the next commit and the working tree that act like
>> commits on the command-line.
>>
>> By "next commit", I mean "the commit that would be generated if the
>> "commit" command was run right now".  "Next commit" is not the same as
>> the index.  The index is a _file_ that serves multiple purposes.
>> (Think of its state during a conflicted merge.)  But the index does
>> usually hold the files that change between HEAD and the next commit.
>>
>> For the alias for the next commit and working tree, I suggest "NEXT"
>> and "WTREE".  Creating these aliases will make the interface more
>> regular. It will remove oddities like "diff --cached FOO" and replace
>> them with "diff NEXT FOO" and mean that "diff" and "diff FOO" can be
>> explained as "diff WTREE NEXT" and "diff WTREE FOO".
>
> This idea was proposed multiple times on the git mailing list, and every
> time it was rejected.
>
> The problem is first, that you make INDEX / STAGE / NEXT and
> WORK / WTREE *look* like commits (like pseudo symbolic refs), while
> they do not *behave* like commits.
>
> "git show HEAD" looks differently from "git show NEXT" or "git show WTREE".
> Neither the index nor working tree have a parent, or author, or commit
> message.  The index (staging area) can have stages, though you sidestep
> this issue by handwaving it away.  Working area has notion of tracked,
> untracked ignored and untracked not ignored (other) files.   Etc., etc.

I knew of some of these issues and I agree that I was handwaving them
away.  I didn't have all the answers and certainly didn't want to
appear to be claiming to have them.  I assumed that if the idea had
merit that these issues could be worked out.

That this idea has been brought up multiple times says that it does
have some merit.  But apparently not enough merit.

> BTW. both index and worktree have their own "aliases", namely ':0:'
> for index (stage 0), and ':' or ':/' for top tree.

Really?  Where can these aliases be used?

> Second, it doesn't solve issue of needing --cached and/or --index
> switches completely.  Those pseudo-almost-refs hide them for "git diff",
> "git grep", "git ls-files", perhaps "git submodule" where we *read*
> from index, but not for "git apply", "git rm" or "git stash" where
> those switches affect *writing*.

I agree with you that it would not get rid of all switches.  I never
expected it to.  My major aim was to simplify things like the "diff"
command, whose different variations I have trouble remembering.

>> 2. Adopt a notation for describing what commands do.
>>
>> I am sure in developer discussions there are descriptions of the
>> "commit" command as something like:
>>    HEAD = new(HEAD + (NEXT-HEAD))
>>    NEXT = HEAD
>
> Basic algebra fail
>
>  HEAD + (NEXT-HEAD) == NEXT
>
> Besides "git commit" creates commit from state of index, no diffing or
> patching is involved.

I would claim that the "state of index" is an approximation of
(NEXT-HEAD).  Also, the new Tree and Blob objects that get written
during the commit are another approximation of (NEXT-HEAD).  Neither
of these is exactly a patch applied to HEAD, but that's the intent I
was going for with my algebraic identity.  (It's not a fail; it's an
unoptimization!)
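
To make the identity concrete, here is a minimal sketch in Python,
assuming a tree is just a dict mapping path to content.  The helper
names tree_diff and tree_apply are made up for illustration; they are
not Git functions, and (as you say) Git itself stores snapshots rather
than applying patches.

```python
# Hypothetical model: a tree is a dict mapping path -> file content.

def tree_diff(new, old):
    """Model of (new - old): a patch mapping each changed path to its
    content in `new`, or to None if the path was deleted."""
    patch = {}
    for path in set(new) | set(old):
        if new.get(path) != old.get(path):
            patch[path] = new.get(path)  # None marks a deleted path
    return patch

def tree_apply(base, patch):
    """Model of (base + patch): return a new tree, leaving `base` untouched."""
    result = dict(base)
    for path, content in patch.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result

HEAD = {"a.txt": "1", "b.txt": "2"}
NEXT = {"a.txt": "1 changed", "c.txt": "3"}

# The "unoptimized" form of commit: HEAD + (NEXT - HEAD) gives back NEXT.
new_head_tree = tree_apply(HEAD, tree_diff(NEXT, HEAD))
```

In this toy model the patch round-trips exactly, which is the sense in
which the expression reduces to NEXT.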

>> Where "-" creates a patch between versions and + applies a patch.  Git
>> already has some operators like "^", which refers to the parent of a
>> commit. Those are useful for defining things like "commit --amend":
>>    HEAD = new(HEAD^ + (NEXT-HEAD^))
>>    NEXT = HEAD
>
> Which is again not true.

[Addressed below, where the case of HEAD being a merge commit with
multiple predecessors is mentioned.]

>> Having this notation and using it in the man pages will make the exact
>> nature of the operation clear. (Right now, it takes a lot of reading
>> to figure out what happens to NEXT with the various command-line
>> options of "reset".)
>
> It's not that difficult: only "git reset --soft [<rev>]" doesn't
> affect index.
>
> Hrmmm... how would this notation explain differences between
> "git reset --hard", "git reset --keep" and "git reset --merge"?

I don't understand what "git reset --keep" and "git reset --merge" do.
I've read the manpage but am still confused.  One of my reasons for
suggesting a notation is so that there is a clear mathematical
representation of what the commands do.  Once I understand them, I can
make an attempt at a notation that can explain them.

>> Currently, to understand what commands do, I use "A Visual Git
>> Reference", which has been extremely valuable to me. Kudos to Mark
>> Lodato for it.
>> http://marklodato.github.com/visual-git-guide/index-en.html
>
> Unfortunately manpages cannot really include images.  Well, there is
> some kind of obscure graph description language for manpages ('dot' or
> something like that), supposedly, IIRC...

The manpage for "git checkout" has some ASCII art of commit DAGs.
It's almost there...

>> [I've included git commands in a not-formal-enough notation at the end
>> of this email.]
>
> NEVERTHELESS some kind of semi-formal notation might be useful.

I'm glad you agree.  Do you think my not-formal-enough notation is a
good start or do you want to propose another notation to start from?

>> 3. Move the operations "checkout -- <file>" and "reset -- <file>" to
>> their own command names
>>
>> This is my biggest and most important suggestion.
>>
>> "checkout -- foo.txt" copies foo.txt from NEXT to WTREE. Similarly,
>> "reset -- foo.txt" will copy foo.txt from HEAD to NEXT.
>
>  "checkout HEAD -- foo.txt" copies foo.txt from HEAD to NEXT and WTREE
>
>  "checkout HEAD^ -- foo.txt" copies foo.txt from HEAD^ to NEXT and WTREE
>  "reset HEAD^ -- foo.txt" copies foo.txt from HEAD^ to NEXT
>
>> These are operations to designate/undesignate files in the next commit
>> and should be grouped with others like them: "add", "rm" and "mv". (In
>> fact, the man page for "reset -- <file>" even says that it is the
>> opposite of "add"!)
>>
>> When these file-based operations are removed from "checkout" and
>> "reset", the purposes of those commands become clearer: "checkout"
>> changes HEAD to a new branch and "reset" moves the current branch
>> pointer to a different commit.  These operations may share code with
>> the operations "checkout -- <file>" and "reset -- <file>", but they
>> serve different purposes from the user's perspective and the user
>> should have different names to refer to them.
>>
>> As for naming these new commands, the "yet-another-porcelain" renames
>> "reset -- <file>" to "unadd", which I like very much.
>
> Well, that goes counter to reducing number of commands, but I quite
> like this name.  Though "unadd <revision> -- <file>" looks a bit
> strange...

I agree, that does look strange.  I think it would be by far the less
frequent usage, but still strange.

>> For the other, my best suggestion is "head-to-next", but I'm sure
>> someone can do better.
>
> I'd rather remember that "git checkout" is about checking out
> something to a working area.

Now that I've separated these two usages of "checkout" in my brain,
"git checkout <branch>" is all about changing to a different branch.
That files in the working tree change is just incidental to moving to
the new branch.

The manpage paragraph for "git checkout -- <file>" has in bold that
this usage "does not switch branches".  So, for me, it's a completely
different usage and should be a different command.

I wish I had a reasonable name to suggest for the new command.

>> 4. Deemphasize the "branch" command for creating branches.
>>
>> I assumed that the command "branch" was used for creating branches.
>> After all, that's how it is done in the "gittutorial(7)" man page.
>
> It _is_ used to create branches.  But perhaps we should update
> gittutorial(7) (and check users manual)...

Thank you.

>> However, after reviewing all the major commands, I find that it is the
>> _last_ way I want to create a branch. It creates a new branch, but it
>> doesn't switch HEAD to the new branch!
>
> "checkout -b" is just shortcut for "branch" + "checkout".  Very
> convenient one, that is...

Yes, it's my primary way of making a branch now.

>> The commands that should be emphasized are "checkout -b <name>",
>> "commit -b <name>", and "stash branch".  These make sense in normal
>> git usage. The "branch" command has its uses but it is not usually the
>> way you want to create a branch.
> [...]
>
>> ----
>>
>> These are just some commands written in a not-quite-formal notation.
>> This notation doesn't handle a detached head, adding directories, the
>> state after a conflicted "stash pop", etc.  Still, as it is, I think
>> it's very informative to users for getting the gist of what a command
>> does.
>>
>> "add foo.txt"
>>    NEXT:foo.txt = WTREE:foo.txt
>
> What about "add --intent-to-add foo.txt"?  What about "add <directory>"?
> What about resolving merge conflicts?

Good points.  These are all interesting cases that a fully developed
formal notation should make sure to address.

>> "rm foo.txt"
>>    delete(NEXT:foo.txt)
>>    delete(WTREE:foo.txt)
>> "rm --cached foo.txt"
>>    delete(NEXT:foo.txt)
>> "/bin/rm foo.txt"
>>    delete(WTREE:foo.txt)
>
> O.K.  Note however that "git rm foo.txt" on conflicted entry would
> clean up conflict.

Yes.  "git add foo.txt" is also used to resolve conflicts.

>> "mv foo.txt bar.txt"
>>    WTREE:bar.txt = WTREE:foo.txt
>>    NEXT:bar.txt = WTREE:foo.txt
>>    delete(WTREE:foo.txt)
>>    delete(NEXT:foo.txt)
>
> O.K., but what is important are atomicity and safety checks.

I think it's best to assume every operation is done atomically.
(Right?)  I'm not sure how to denote safety checks or prerequisites.
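
One possibility is to write the checks as preconditions that are
evaluated before any assignment happens.  A minimal sketch in Python,
assuming NEXT and WTREE are dicts from path to content; the name
PreconditionError and the two checks are my own assumptions, not Git's
actual rules, and the body follows my "mv" notation literally.

```python
class PreconditionError(Exception):
    """Raised when a safety check fails; no state has been modified."""

def mv(next_tree, wtree, src, dst):
    """Model of "mv src dst": return (new_next, new_wtree), or raise."""
    # Safety checks (assumed for illustration), evaluated before any change.
    if src not in next_tree or src not in wtree:
        raise PreconditionError(f"{src} must exist in both NEXT and WTREE")
    if dst in next_tree or dst in wtree:
        raise PreconditionError(f"{dst} already exists")

    # Work on copies so the caller sees either all four steps or none.
    new_next, new_wtree = dict(next_tree), dict(wtree)
    new_wtree[dst] = wtree[src]   # WTREE:dst = WTREE:src
    new_next[dst] = wtree[src]    # NEXT:dst  = WTREE:src
    del new_wtree[src]            # delete(WTREE:src)
    del new_next[src]             # delete(NEXT:src)
    return new_next, new_wtree

moved_next, moved_wtree = mv({"foo.txt": "old"}, {"foo.txt": "new"},
                             "foo.txt", "bar.txt")
```

Returning new dicts instead of mutating the arguments is what stands in
for atomicity here: a failed check leaves the inputs untouched.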

>> "checkout -- foo.txt"
>>    WTREE:foo.txt = NEXT:foo.txt
>> "reset -- foo.txt"
>>    NEXT:foo.txt = HEAD:foo.txt
>
> Those are not the only modes.
>
>> "commit"
>>    HEAD = new(HEAD + (NEXT-HEAD))
>>    NEXT = HEAD
>
>   HEAD + (NEXT-HEAD) == NEXT
>
> "git commit" doesn't apply patches.

Agreed.  I addressed this above.

>> "commit --amend"
>>    HEAD = new(HEAD^ + (NEXT-HEAD^))
>>    NEXT = HEAD
>
>  HEAD^ + (NEXT-HEAD^) == NEXT
>
> "git commit --amend" works correctly even if HEAD is a merge commit!

Another good issue.  A formal notation will need to specify how to
deal with cases of a commit with more than one predecessor.

>> "checkout FOO" (requires WTREE==NEXT==HEAD)
>
> No such requirement.  It's all about which files differ between HEAD
> and FOO.  If you start working on some file, and decide that you
> should have made the change on a different branch, "git checkout FOO"
> allows you to move to the FOO branch... assuming that the changed file
> has the same contents in HEAD and in FOO.

Okay.  I will have to think about how a formal notation can denote that...
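
A first attempt might be a per-path precondition.  This is a minimal
sketch in Python, assuming trees are dicts from path to content and
ignoring untracked files, "-f" and "-m".  The function name and the
exact rule are my simplified reading of your description, not Git's
actual implementation.

```python
def checkout(head, next_tree, wtree, foo):
    """Model of "checkout FOO": return (new_head, new_next, new_wtree).

    Raises RuntimeError if a path that differs between HEAD and FOO
    also carries local changes in NEXT or WTREE."""
    # Paths whose content differs between HEAD and FOO (None means absent).
    differing = {p for p in set(head) | set(foo) if head.get(p) != foo.get(p)}

    # Precondition: every differing path must be unmodified locally.
    for path in sorted(differing):
        if (next_tree.get(path) != head.get(path)
                or wtree.get(path) != head.get(path)):
            raise RuntimeError(f"local changes to {path} would be overwritten")

    # Only the differing paths are updated; other local edits carry over.
    new_next, new_wtree = dict(next_tree), dict(wtree)
    for path in differing:
        for tree in (new_next, new_wtree):
            if path in foo:
                tree[path] = foo[path]
            else:
                tree.pop(path, None)
    return dict(foo), new_next, new_wtree

HEAD = {"a.txt": "1", "b.txt": "2"}
FOO = {"a.txt": "1", "b.txt": "3"}
# a.txt is edited locally but is identical in HEAD and FOO, so it carries over.
new_head, new_next, new_wtree = checkout(
    HEAD, dict(HEAD), {"a.txt": "1 edited", "b.txt": "2"}, FOO)
```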

> And there are "checkout -f" and "checkout -m".
>
>>    WTREE = FOO
>>    NEXT = FOO
>>    HEAD ::= FOO // changes the alias of HEAD to refer to FOO
>
> And this is supposed to be easier to understand?

"checkout" is a very simple command to describe in English, so the
mathematical description will be more convoluted.  I don't (yet)
understand some of the variants of "git reset" even though they are
written in English.  I'm hoping a formal notation will make them
easier to understand.

>> "stash save"
>>    STASH = new(new(HEAD+(NEXT-HEAD))+WTREE-NEXT)
>>    NEXT = HEAD
>>    WTREE = HEAD
>>    push(STASH)
>> "stash pop"
>>    STASH = pop()
>>    WTREE = HEAD + (STASH-STASH^^)
>>    NEXT = HEAD + (STASH^-STASH^^)
>
> ???

"stash save" makes two new consecutive commits: one equal to NEXT and
another equal to WTREE.  (This is "STASH" above, with my
unoptimizations.)  I don't know where the SHA of the final commit gets
stored, so I just created push() and pop() commands.

Rereading the man page, I see that the commit containing WTREE has two
parents.  This notation doesn't have a way to denote that.
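
For what it's worth, the two-parent shape is easy to write down once
commits carry an explicit list of parents.  A minimal sketch in Python,
based on my reading of the git-stash man page (the stash commit has HEAD
as its first parent and the commit recording the index as its second);
the Commit class and stash_save function are hypothetical names for
illustration only.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Commit:
    tree: dict                            # path -> content snapshot
    parents: Tuple["Commit", ...] = ()    # zero, one, or several parents

def stash_save(head, next_tree, wtree):
    """Model of "stash save": return (stash, new_next, new_wtree)."""
    # Commit recording NEXT, with HEAD as its only parent.
    index_commit = Commit(tree=dict(next_tree), parents=(head,))
    # Commit recording WTREE, with two parents: HEAD and the index commit.
    stash = Commit(tree=dict(wtree), parents=(head, index_commit))
    # Afterwards NEXT and WTREE are both reset to HEAD.
    return stash, dict(head.tree), dict(head.tree)

head = Commit(tree={"a.txt": "1"})
stash, new_next, new_wtree = stash_save(
    head,
    next_tree={"a.txt": "1 staged"},
    wtree={"a.txt": "1 staged and edited"},
)
```

With parents as a tuple, a merge commit is just a commit with more than
one entry, so the same representation would also cover the "commit
--amend" on a merge case.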

> [...]
>> "cherry-pick FOO" (requires WTREE==NEXT==HEAD)
>>    HEAD = new(HEAD + (FOO - FOO^))
>>    NEXT = HEAD
>>    WTREE = HEAD
>> "rebase FOO" is basically an iterated application of "cherry-pick"
>
> Ordinary rebase isn't.
>
> --
> Jakub Narebski
> Poland
> ShadeHawk on #git
>

Jakub, thanks again for taking the time to respond.