Re: [PATCH v2] git checkout -b: unparent the new branch with -o

Erick Mattos <erick.mattos@xxxxxxxxx> · Fri, 12 Mar 2010 13:45:17 -0300

Hi,

Thanks for answering.
I had to write a little in reply so I beg for you patience to read it.

This is a new option and before we start talking about the
implementation we should agree about its usefulness and the subjects
addressed by it.

This is something that probably you haven't needed so I understand
your resistance.  In a intense merging software development work flow
it would be quite rare to have a merge of an unrelated branch.

But not all of the software developments happen under this work flow.
There is a lot of very small projects.  Being developed by a few
people or even by one person.  And those softwares count too.  They
could become very important for everyone some day.

For those it is very common to:

* Do some parallel development.
* Working under unrelated parts before merging the start point of the
new software.
* Recreate history.  It is not published yet so it is not being widely
cloned and people just
  want to have a good repository before publishing;
* Adding files which is not meant to be part of anything in a temporary fashion;
* Have different approaches starting different sources from those
software versions until you
  find out the best one.

Git is becoming much more used than it was planned to be.  It is even
being used nowadays for other exotic jobs like regular file
versioning.

Those needs are nothing new.  If you query Google for "new empty
branch git" you will find that people have been querying it for.

There are a lot of ways to have the unrelated branch: you could start
a new repository and import from it; you could foresee that you will
need something like that and start all development with an empty
commit, ...

But the proposed solution is the best approach.  And it is the one I
use when needed.

This new option is not a new solution.  Just a direct implementation
so people will not need to know this hack.  See it:

Git SCM Wiki:
http://git.wiki.kernel.org/index.php/GitTips#How_to_create_a_new_branch_that_has_no_ancestor

Git Book:
http://book.git-scm.com/5_creating_new_empty_branches.html

After all said let's face the implementation.

2010/3/12 Junio C Hamano <gitster@xxxxxxxxx>
>
> Erick Mattos <erick.mattos@xxxxxxxxx> writes:
>
> > @@ -25,6 +26,10 @@ linkgit:git-branch[1] were called; in this case you can
> >  use the --track or --no-track options, which will be passed to `git
> >  branch`.  As a convenience, --track without `-b` implies branch
> >  creation; see the description of --track below.
> > +When using -b, it is possible to use the option -o to set the new branch
> > +as unparented thus unrelated to the previous branch.  The new code will
> > +be committed by using 'git add' and 'git commit' as if it was an initial
> > +commit.
>
> This says what the option does, but it is hard to guess why it would be a
> good thing to do in the first place from the above description.

So you would like to have some examples on it?
I was trying to let people know how to use it not why to use it.

> The use case in your commit log message wasn't convincing either.  If such
> a new branch will be merged later, especially if the trees of the commits
> in newly rooted history resemble the trees in the original history (I am
> guessing that is the intended use case, as you do not seem to be removing
> anything from the working tree---how is the user expected to use them by
> the way?), not having a common merge base will make the merge harder, not
> easier, and later examination of the history (think "bisect") also
> harder.

Are you talking about the commit log message of the previous version
patch?  If that is the case I replied to your message explaining it
better (but not trying to convince you).

I am not wiping the tree by default because I am not deciding for
people if they are going to use anything from it as a template (even
the directory structure only).

I am not trying to make decisions for the user.  I think he would be
capable of deciding it himself.  That is my way of thinking so I
normally prefer to advice, alert, inform not to impose.

We should remember git users are normally programers not some wind... lamers.

The intended uses of this option was explained up there.

As it is a new fresh development branch it is not normally expected to
have anything in common with original branch so the future merge is
going to be very easy.  Anything exotic is up to the user to solve.

> This looks like a "because we can" feeping creaturism, without any
> "because it is beneficial if users can do this for this reason"
> justification.
>
> And what it can do "because we can" doesn't look very useful, safe, nor
> sane either, with this particular implementation.

Benefits explained up there already.

It is not creaturism.  People need this option at various levels.

> > @@ -509,8 +510,13 @@ static void update_refs_for_switch(struct checkout_opts *opts,
> >       struct strbuf msg = STRBUF_INIT;
> >       const char *old_desc;
> >       if (opts->new_branch) {
> > -             create_branch(old->name, opts->new_branch, new->name, 0,
> > -                           opts->new_branch_log, opts->track);
> > +             if (opts->new_branch_orphan) {
> > +                     discard_cache();
> > +                     remove_path(get_index_file());
>
> I don't think we want to see "remove_path()" here.  The function is about
> the files in the work tree, and not about the files under .git/.
> Currently the codepath to create and write out the index is abstracted
> like this:
>
>    fd = hold_locked_index();
>    ... populate the_index structure ...
>    write_cache(fd);
>    commit_locked_index();
>
> and by only changing the implementation of these three functions, we could
> store the index somewhere other than on the filesystem (say, database or
> memcache).  By using remove_path() on the return value of get_index_file()
> in a random codepath like this one, you are pre-seeding a bug for other
> people who may ant to make changes like that in the future.

I see your point but what we will be doing is just wiping the index
out.  So I did it in a direct fashion.

Please let me know after you read this whole reply if you still
consider this way as inadequate.

> But the above is just an advice on the coding, assuming that what is being
> coded is sane, which unfortunately is not.  You are nuking the index but
> without doing anything to the working tree files.  Why?  The user manually
> has to remove them?  Or re-add them?  I don't think you would want to call
> discard_cache() _nor_ remove the index file here.  This is probably even
> more so if you think about a case where the user is using a sparse
> checkout.  Some random set of files are still in the working tree but
> other files aren't, and the index used to keep track of which is what, but
> you lost that information by discarding the cache.

I have already replied that but let me state it: we will be leting
people start an unrelated branch, unrelated work, fresh new stuff.
And leting them do anything they want with the files in the work tree.
 Deleting them as easy as 'git clean -df'.

>
> If you are going to leave the files in the working tree intact, you should
> make it the user's responsibility to run "git rm -f" after "checkout -o",
> if the user wants to start from an empty index.  That would also make it
> safer; a mistaken "checkout -o" would be easier to recover from by running
> reset without --hard, if it does not touch the index.

It is already very safe and easy to recover by just doing a 'git
checkout PREVIOUS_BRANCH'.

>
> By the way, the worst part of this patch is that I didn't see any safety
> checks tested in the test script.  What prevents the users from typing an
> extra -o by mistake, while having some changes to the index and/or the
> files in the working tree?  Even if you change it not to touch the index,
> it would probably make sense to make sure this "feature" is a lot harder
> to invoke by mistake.  In a sane workflow you wouldn't be creating root
> commits left and right.  Perhaps by not giving it a '-o' shorthand would
> be a good start.

I don't quite see your point here.  Not letting people use it when
they don't want to use it?!

Like protecting him from typing a letter 'o' at keyboard when creating
a branch with -b and pressing enter after this mistake?!

I really have not get it.  Could you please explain it from what we
have to protect him?

>
> In the above I assumed (by guessing from the fact that you are not
> touching files in the working tree) that the tree eventually committed by
> the user on this newly rooted history would have some resemblance to the
> trees in the original history.  But if -o is intended to be used for
> really "starting from void" with an empty tree, then I think the option
> should instead do this:

The user could probably use something from the work tree as template
so I am not making assuptions.  Just leting him know that with a 'git
clean -df'  everything would be vanished.

>
> This is just style, but I find it easier to read if you have one SP before
> redirect and no SP between redirect and the filename, i.e.:
>
>        echo initial >"$TEST_FILE" &&

You are the one that must demand the style so I am going to change it.

>
> > +     test "alpha" = "$(git branch | sed -n "/*/s/\* //p")" &&
>
> Don't read from "git branch" in scripts; use symbolic-ref on HEAD.

All right then.

Thanks for you patience to read this email and I hope this effort will
worth something.

Regards
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html