Re: [PYRITE] Status update and call for information.

"Govind Salinas" <blix@xxxxxxxxxxxxxxxxx> · Sun, 25 May 2008 13:22:56 -0500

On Sun, May 25, 2008 at 4:23 AM, Jan Krueger <jk@xxxxx> wrote:
> Hi,
>

Hey.
> allow me to play the devil's advocate here.
>
> Your approach is to combine different concepts that are similar and can
> be used to do the same thing but in completely different ways. Wouldn't
> this actually create more confusion than keeping these concepts
> separate? Examples follow.

I think it is highly dependent on how you organize it.

>> The idea is that there should be one fairly obvious way to do
>> something. If you have that then there is less confusion, especially
>> when someone asks for help.  Plus, if there is one fairly obvious way
>> to do something, then people will need to ask for help less often.
>
> If people didn't ask things like "how do I delete a file", I'd be more
> inclined to believe that. ;)
>
> Now, on to the fun part.
>
>> * commit = commit + push + stash + init
>>           push:  This is here because it fits the traditional notion
>> of what a commit does, which is to send a commit to the central
>>                     server.  I think of it as "I am committing my
>> changes to the remote repository.
>
> The problem I see there is that it will be difficult to include the
> part of push that sends several commits at once. It is a very common
> workflow to create a series of local commits, test them, possibly
> rewrite them in several ways, and finally push the entire set. To have
> your combined command do that, you'd need something like "pyt commit
> --to-remote --use-new-local-commits". Is that better than "pyt
> push" (and does describing this as "committing" actually make sense)? I
> think not. It's sacrificing convenience and sense for reducing the
> visible number of commands.

I have responded to most of this in other mails, but I think it bears
repeating.  In most cases, when combining commands, it should be
possible to do so without adding a ton of flags.  Otherwise, you are correct,
it would not help anything.  In some cases it does make sense to add a flag
to mimic another command.

Take cherry/cherry-pick for example.  These commands are related even
though they don't do the same thing.  Because they are useful in combination
it makes some sense (to me) that I would ask the cherry command both for
"what is pickable" and to actually do the picking.  Does that make sense?
Of course you would do this with a flag.

But onto the current example.  Try to imagine the command working like this...

pyt ci [what to checkin] [where to checkin to]

[what to checkin] defaults to the changes in your working directory.  It can
also be a subset of these.  In this case [where to checkin to] would most
likely be the local branch.  Whether it is allowed to checkin directly to a
remote repository is not something I am convinced about one way or the
other.

Now if you look at that, it is possible to write

pyt ci branch:foo remote:origin:foo

No flags are needed, and it could probably be simplified to

pyt ci foo remote:origin:foo

or even

pyt ci foo origin:foo

depending on how much mind reading we want to do.  Personally
I see that as just as simple as the git equivalent without the need of
an added command.
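
To make the "mind reading" a bit more concrete, here is a rough sketch
of how the [where to checkin to] argument might be parsed.  This is not
actual pyrite code: the Target type and parse_target() are hypothetical
names, and the rule that an unknown prefix means a remote is only an
assumption.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    kind: str              # "branch" or "remote"
    remote: Optional[str]  # remote name; None for a local branch
    name: str              # branch name


def parse_target(spec: str) -> Target:
    """Parse 'branch:foo', 'remote:origin:foo', 'origin:foo' or 'foo'."""
    parts = spec.split(":")
    if not all(parts):
        raise ValueError(f"empty component in target spec: {spec!r}")
    if len(parts) == 3 and parts[0] == "remote":
        # Fully explicit form: remote:<remote>:<branch>
        return Target("remote", parts[1], parts[2])
    if len(parts) == 2 and parts[0] == "branch":
        # Explicit local branch: branch:<branch>
        return Target("branch", None, parts[1])
    if len(parts) == 2:
        # Assumption: any other prefix is taken to be a remote name.
        return Target("remote", parts[0], parts[1])
    if len(parts) == 1:
        # Assumption: a bare name is taken to be a local branch.
        return Target("branch", None, parts[0])
    raise ValueError(f"cannot parse target spec: {spec!r}")
```

With this, "remote:origin:foo" and "origin:foo" resolve to the same
target, as do "branch:foo" and "foo".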

>>           stash:  What is stash but a temporary commit (not on the
>> branch)?
>
> This is correct, but only technically. In fact, a commit is something
> that you'll typically share with others, whereas a stash is not. This
> makes me doubt it's helpful to combine both.

Again let us think of this as

pyt ci [what to checkin] [where to checkin to]

if I then say

pyt ci stash:[name]

I am saying [what to checkin] is the default (changes in the working set)
and [where to checkin to] is the stash.  This, again, makes sense to me
and you would later

pyt co stash:<name>

to get it back.

It seems to be nicely symmetric.

>> * checkout = checkout + clone + branch + remote
>
> Checkout already has a doubtful duality: it can either switch branches
> or check out a specific version of a single file. I don't think
> capitalizing on the 'switch branches' concept while keeping the other
> function is a good idea. At the very least, consider splitting "fetch
> other version of this file" into a separate command.

You could be right about that, the checkout files functionality might
fit better in the "recover" command.  It makes language sense as
well, if you catch my meaning.  Meaning that when I checkout a file
I could say to myself "i want to recover the version of the file from
X commit."

> I can see another source of confusion here: with this, checkout can
> either create a new repo or a new branch in the same repo. In other
> words, what it does depends on the context you call it in. This is a
> no-no in interface design.

Actually, I was suggesting that "checkin" would incorporate the init
command, not "checkout."  The idea being that you can just skip
the init step.  I don't see this as being much different from
"git clone" which could be seen as short for

git init && git pull <uri>

except now it would be

git init && git commit ...

> Finally, "remote" could just as well go with pull or push (which is
> what it's actually used for in practice). The act of defining or
> removing a remote is misplaced here since it has nothing to do with
> checking anything out.

Please look at the following example and tell me if it makes more
sense.

Again, we define checkout as...

pyt co [what do i want to checkout] [where do i want to checkout to]

so if I say

pyt co git://foo.com/bar.git mybar

this would translate to

git remote add -f mybar git://foo.com/bar.git
git fetch mybar

I have checked out git://foo.com/bar.git to refs/remotes/mybar and as a
convenience I have set up tracking for you.

Any pull/fetch/merge done after this could take advantage of the
remote having been set up.
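
As a sketch only, the translation above could be expressed as a small
function that builds the git invocations.  The function name
checkout_remote_commands() is hypothetical and not taken from pyrite.
Note that "git remote add -f" already runs a fetch once the remote is
set up, so the second command is kept here only to mirror the
translation as written.

```python
from typing import List


def checkout_remote_commands(url: str, name: str) -> List[List[str]]:
    """Build the git commands for 'pyt co <url> <name>' (hypothetical)."""
    return [
        # Register the remote under the given name and fetch it (-f).
        ["git", "remote", "add", "-f", name, url],
        # Explicit fetch, mirroring the translation in the text.
        ["git", "fetch", name],
    ]
```

Each inner list could be handed to subprocess.run() as is.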

>>   gc = clean + gc + prune + repack
>
> See Jakub's comment about that. I strongly agree with him.

Er, ok.  I am actually not sure that Jakub and I disagree on this point.
He says prune and repack are already absorbed by gc and he says
that clean could be seen as a special case of gc.  Perhaps I
misunderstood him?

>> * pull = pull, fetch, merge
>
> Unlike what Jakub says, I can imagine this working well in the
> distributed case. It could be a command that does both fetch and merge
> by default and you can switch off either. It would make little sense
> when merging a number of local branches, however.
>
>>   revert = reset + reflog
>
> Keep in mind that you need to stick git-revert somewhere, too.
>
>> (still revert)
>>           I was thinking of calling this command "recover" instead of
>>           revert, which I still think might describe what I want to do
>>           and might tell you why I think that reflog is something to
>>           combine here.  "revert --what-can-i-revert-to" would show
>>           the output of reflog.
>
> Another source of confusion, since I almost never use the reflog for
> git-reset. I almost always use a commit ID or something like HEAD^, or
> no argument at all (mostly for reset --hard).

I have decided on --show-reflog to try and reduce any potential confusion.
Does that sound better?  Please look at the help from the following
file and let me know if you think that is still confusing.

http://gitorious.org/projects/pyrite/repos/blixs-clone/blobs/wip/pyrite/standard/recover.py

All that aside, how does this sound?  A revert/recover command that
does the following: it can do a "git reset --hard" to revert your current
changes.  If you want to revert just one or more files then it would
evaluate to "git checkout HEAD -- <files>...".  It could also
recover to a previous state by giving it a commit id.  Then have a
separate reverse-commit command that does what git-revert does.
Part of the reason I think this is useful is that "revert" means different
things to different people, whereas "recover" and "reverse-commit"
make more sense.  Also, it avoids the confusion you were talking about
earlier with checkout being too overloaded with the file stuff and the
branch stuff.
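
A minimal sketch of that dispatch, to show how little logic is involved.
The function recover_command() is a hypothetical name, not the code in
recover.py.  Mapping a bare commit id to "git reset --hard <commit>",
and using the commit id in place of HEAD when files are given, are
assumptions on top of what is described above.

```python
from typing import List, Optional, Sequence


def recover_command(commit: Optional[str] = None,
                    files: Sequence[str] = ()) -> List[str]:
    """Pick the git command a 'recover' invocation would evaluate to."""
    if files:
        # Restore only the named files; HEAD unless a commit is given
        # (using the commit here is an assumption).
        return ["git", "checkout", commit or "HEAD", "--", *files]
    if commit:
        # Assumption: recovering to a commit means a hard reset to it.
        return ["git", "reset", "--hard", commit]
    # No arguments: throw away the current changes.
    return ["git", "reset", "--hard"]
```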

>> * track = add/addremove
>
> That would only make sense if you hide the index completely. I think
> that's a bad idea, because the index is a really powerful thing. At the
> very least, add -i gets impossible if there's no way of influencing the
> index directly.

I like having my cake and eating it too.  I intend to hide the index AND
let the user take advantage of it.  I intend to do this by postponing
its use until commit time, at which point the user would have the
option of being --picky about what they want to commit.  At that point
I would do something like "git add -i" or "git add -p" to let them
choose.  This lets them do partial commits while never having to
think about whether they have staged the right information (in the
normal case).
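
Roughly, and only as an illustration, the commit step might expand like
this.  The name commit_commands() is hypothetical, and using
"git commit -a" for the normal case is an assumption about how the
index would be hidden; it only picks up files that are already tracked.

```python
from typing import List


def commit_commands(picky: bool = False) -> List[List[str]]:
    """Build the git commands behind a commit, with or without --picky."""
    if picky:
        # Let the user choose hunks interactively, then commit the index.
        return [["git", "add", "-p"], ["git", "commit"]]
    # Assumption: the normal case commits all changes to tracked files,
    # so the user never has to stage anything by hand.
    return [["git", "commit", "-a"]]
```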

> It would be a lot better to keep commit -a (and perhaps hint at it if
> commit is called with an unchanged index) and define something like the
> following:
>
> * stage (or record, take, use) -- same as git-add.
> * remove (rm) -- same as git-rm. Probably good to rename --cached.
> * unstage (or unrecord) -- revert index to version in last commit.

I don't quite follow, perhaps an example would make things clearer?

[regarding revision number]
>> They are useful for a different purpose.  If I say master~5 today it
>> probably won't yield the same commit tomorrow.  while 6450:master
>> would.
>
> That's right, but keep in mind that revision numbers will probably make
> people think revision numbers are global, e.g. "hey Bill, check out
> revision 3488 I committed today" (and Bill gets it as 3754). The
> background of either CVS or SVN would encourage this.

Please read my other mails, I address this a couple times.  Other
DVCSs make great use of revision numbers without the property you
talk about.  They understand that these numbers aren't necessarily
portable like that.  However, once a commit is in a branch, it will
always have the same position in *that* branch.

> Also, 6450:master, as a syntax, doesn't cut it. What if in the past of
> master, a merge commit with seven parents happened? It's impossible to
> figure out which parent to follow. You'd have to write something like
> 45;1:357,4:774 (go back 45 commits, take parent 1, go back 357 commits,
> etc.). Just numbering all commits in master's past according to
> depth-first search, on the other hand, will just make revision numbers
> very confusing.

This is just the current git syntax back to front.  If you read my mail you
will see that I define it according to --topo-order, which makes a consistent
ordering of the parents.
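
For illustration, one way a "<number>:<branch>" spec might be resolved
on top of "git rev-list --topo-order".  This is a sketch under
assumptions, not the definition from my other mail: counting from 1
with the oldest commit first (via --reverse) is assumed, and all
function names are hypothetical.

```python
import subprocess
from typing import List, Sequence, Tuple


def parse_revision(spec: str) -> Tuple[int, str]:
    """Split a spec such as '6450:master' into (6450, 'master')."""
    number_text, _, branch = spec.partition(":")
    if not branch:
        raise ValueError(f"expected <number>:<branch>, got {spec!r}")
    return int(number_text), branch


def rev_list_command(branch: str) -> List[str]:
    """git command listing the branch's commits, oldest first."""
    return ["git", "rev-list", "--topo-order", "--reverse", branch]


def nth_commit(number: int, ordered_commits: Sequence[str]) -> str:
    """Return commit number `number`, counting from 1 (an assumption)."""
    if not 1 <= number <= len(ordered_commits):
        raise ValueError(f"no revision {number} in this branch")
    return ordered_commits[number - 1]


def resolve(spec: str) -> str:
    """Resolve a spec against the repository in the current directory."""
    number, branch = parse_revision(spec)
    result = subprocess.run(rev_list_command(branch), check=True,
                            capture_output=True, text=True)
    return nth_commit(number, result.stdout.split())
```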

> Something else to consider is that revision numbers are hardly better
> to remember than (abbreviated) commit IDs. Consider KDE's SVN repository
> with (currently) 812270 revisions...

Heh, like I said, *I* don't need them, *I* like sha1s just fine.  For whatever
reason, other people are scared of the sha1s, so let's give them some
training wheels until they are more comfortable.

>> I just put this in because I have seen several people ask for it on
>> the mailing list recently.  I thought to myself, that they COULD have
>> it if they really wanted.
>
> But with a number of important disadvantages. Sometimes it really is
> better not to have everything you want.
>
> In conclusion, your goal is a good one, but it's something that
> requires a lot of very careful consideration. To name just one thing,
> you need to make it consistent, clean, and still powerful enough to not
> stand in the way of moderately (un)common tasks.

I agree completely.  This thread has been good on many levels; it has
given me a lot to think about and I have refined my ideas quite a bit
and got some very helpful suggestions.  All the things you state
here are goals of mine.

> I believe there is a bit of a tendency in your approach to emulate
> commands of other VCS, and that's not the right way to go if you ask me.
> If I did something like pyt (and I won't lie, I have put some thought
> into it, including writing down a couple of ideas), I would start with a
> clean slate and design something that really makes sense (which neither
> the current git nor any other existing interface stacked on top of it
> can really deliver). David Roundy did this well for darcs, I think: a
> great number of commands have different names than in the classic VCS,
> but they all make a lot of sense. Still, again, darcs's commands
> wouldn't work that well for git; both systems are just too different.

To be honest, my creativity is limited.  I am good at taking an existing
idea and making a new (possibly improved) application/implementation
of it.  If you have *new* ideas about this I will be happy to hear them.

> Something else that's worth considering is that an interface to git is
> not just about reshuffling commands; it's also about behaviour. For
> example, submodules as they currently are are a bit hard to use
> correctly (from what I've read on IRC; I haven't used them myself yet),
> and refspecs are rather non-intuitive to use (especially push :foo).
>
> It'll be interesting to see how the various existing alternative
> interfaces to git will address all these problems.

I really haven't thought about submodules yet.  If you have some
interface ideas I would love to hear them.  One thing I would like
from submodules is for them to remember where they were on a
specific branch/commit.  So if I check out X, the submodules are
checked out to wherever they were when X was checked in.  I
don't use submodules (although I plan to) but I hear that is a
problem.

Thanks for the input.

-Govind