Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]


 



Liu Yubao wrote:
Thanks to Junio for his patient explanation of branches in git. I have found a subtle difference between git and conventional VCSs that newbies can easily
overlook.

I realize that git is a *content tracker*: it creates a commit object only
when the corresponding tree is really modified. Git records merges of content,
not the merge operation as such; that's why git is called a content tracker.
This explains why a merge that is really a fast-forward doesn't create
any new commit.

This behavior differs from conventional VCSs such as CVS and Subversion, and
it confuses newbies who come from them: the notion of a mainline doesn't make
much sense, and 'git log' shows many log entries from other branches. In git,
a branch is almost a tag; you can't get the *track* of a branch (it's a pity
that the reflog is purely local). I am used to the
one-trunk-and-several-side-branches model, where every branch is clearly
isolated, so git confused me a lot at the beginning.


Then what *logical* problem would arise if a merge that is really a fast-forward created a new commit?


If "fake" commits (i.e., commits that don't change any content) were introduced for each merge, they would change the ancestry graph, and the resulting tree(s) wouldn't be cleanly mergeable with the tree they were merged with, because each such "back-merge" would result in
* the "fake" commit becoming part of history
* a new "fake" commit being introduced

Consider what happens when Alice pulls in Bob's changes. The merge base of Alice's HEAD and Bob's tip is the commit Alice's HEAD points to, so the pull results in a fast-forward, like below.

a---b---c---d               <--- Alice
             \
              e---f---g     <--- Bob
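
The fast-forward condition above can be sketched on a toy graph. This is only an illustration, not git's implementation: the `parents` dictionary and the helper names `ancestors` and `can_fast_forward` are made up for this example, and commits are identified by the letters from the diagram.

```python
# Toy commit graph from the diagram: each commit maps to its list of parents.
parents = {
    "a": [], "b": ["a"], "c": ["b"], "d": ["c"],
    "e": ["d"], "f": ["e"], "g": ["f"],
}

def ancestors(commit):
    """Return the set containing `commit` and everything reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def can_fast_forward(ours, theirs):
    """A fast-forward is possible when our tip is an ancestor of their tip."""
    return ours in ancestors(theirs)

alice, bob = "d", "g"
if can_fast_forward(alice, bob):
    alice = bob  # only the branch pointer moves; no commit object is created

print(alice, len(parents))
```

After the pull, Alice's branch simply points at "g", and the graph still holds the same seven commits.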


If we had created a fake commit instead, Alice would get a graph that looks like this:

a---b---c---d-----------h   <--- Alice
             \         /
              e---f---g     <--- Bob


Now we would have two commits whose trees are identical, because the merge can't cause conflicts, but Alice and Bob will have reached that tree in two different ways. When Bob decides to go get the changes Alice has made, his graph will look something like this:

a---b---c---d-----------h          <--- Alice
             \         / \
              e---f---g---i        <--- Bob


He finds it odd that he's got two commits that, when checked out, lead to the exact same tree, so he asks Alice to get his tree and see what's going on. Alice will then end up with this:

a---b---c---d-----------h---j      <--- Alice
             \         / \ /
              e---f---g---i        <--- Bob


Now there are four commits that all point to identical trees, but the ancestry graphs differ between the developers. In the case above, there are only two people working on the same project. Imagine the number of empty commits you'd get in a larger project, like the Linux kernel.
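
The Alice/Bob exchange above can be replayed in a small simulation. It is a sketch under simplifying assumptions: commits are plain dictionary entries, tree names such as `tree_g` are placeholders, true (diverged) merges are not modeled, and the `always_commit` flag is a hypothetical switch for the "fake commit on every merge" policy, not a git option.

```python
def build_history():
    """Linear history a..g; every commit gets its own placeholder tree name."""
    commits, prev = {}, None
    for name in "abcdefg":
        commits[name] = ("tree_" + name, [prev] if prev else [])
        prev = name
    return commits

def ancestors(commits, tip):
    """Return `tip` plus every commit reachable from it."""
    seen, stack = set(), [tip]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(commits[current][1])
    return seen

def pull(commits, ours, theirs, new_name, always_commit):
    """Return our new tip after pulling `theirs`."""
    if theirs in ancestors(commits, ours):
        return ours                      # already up to date
    if ours not in ancestors(commits, theirs):
        raise NotImplementedError("diverged histories are outside this sketch")
    if not always_commit:
        return theirs                    # fast-forward: move the pointer only
    # "Fake" commit: same tree as theirs, but a new node in the graph.
    commits[new_name] = (commits[theirs][0], [ours, theirs])
    return new_name

def simulate(always_commit):
    """Alice pulls Bob, Bob pulls Alice, Alice pulls Bob again."""
    commits = build_history()
    alice, bob = "d", "g"
    alice = pull(commits, alice, bob, "h", always_commit)
    bob = pull(commits, bob, alice, "i", always_commit)
    alice = pull(commits, alice, bob, "j", always_commit)
    same_tree = sorted(n for n, (tree, _) in commits.items() if tree == "tree_g")
    return alice, bob, same_tree

print(simulate(always_commit=False))
print(simulate(always_commit=True))
```

With fast-forwarding, both developers end up on "g" and only "g" carries that tree. With a fake commit per merge, Alice ends on "j", Bob on "i", and "g", "h", "i" and "j" all share one tree, matching the last diagram.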

Fast-forward is a Good Thing and the only sensible thing to do in a system that is designed to be fully distributed (i.e., where there isn't necessarily any middle point with which everybody syncs) and that has to scale beyond ten developers who merge frequently with each other.

If we set aside all compatibility, efficiency, memory and disk consumption
problems:
(1) we can get the track of a branch without the reflog, because HEAD^1 is
always the tip of the target branch (usually the working branch) before the merge.

(2) with the track, the branch mechanism in git is possibly easier to understand, especially for newbies from CVS or Subversion. I really like git's lightweight, simple but powerful design and its great efficiency, but I am really surprised that 'git log' shows logs from other branches and that a side branch can suddenly become part of the main line.

A revision graph that represents a fast-forward style merge like this:

            (fast forwarding)
 ---- a ............ * ------> master
       \            /
        b----------c -----> test         (three commits with three trees)

can be changed to:

 ---- a (tree_1) ----------- d (tree_3) ------> master
       \                    /
        b (tree_2) ------- c (tree_3) ----> test
(four commits with three trees; that's normal, as more than one road can lead to Rome :-)
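
Point (1) can be illustrated with a toy first-parent walk over the two graphs above. This is a sketch only: the dictionaries and the helper `first_parent_track` are invented for the example, and the first entry of each parent list plays the role of HEAD^1. Git itself offers a comparable view through `git log --first-parent`.

```python
def first_parent_track(parents, tip):
    """Follow only the first parent of each commit, starting at `tip`."""
    track = []
    while tip is not None:
        track.append(tip)
        tip_parents = parents[tip]
        tip = tip_parents[0] if tip_parents else None
    return track

# master was fast-forwarded to "c": test's commits sit on master's track.
fast_forwarded = {"a": [], "b": ["a"], "c": ["b"]}

# master got an extra merge commit "d" whose first parent is the old tip "a".
with_merge_commit = {"a": [], "b": ["a"], "c": ["b"], "d": ["a", "c"]}

print(first_parent_track(fast_forwarded, "c"))
print(first_parent_track(with_merge_commit, "d"))
```

In the fast-forwarded graph the walk from master passes through "b", a commit made on test. With the extra commit "d" it visits only "d" and "a", which is the "track" of master the proposal is after.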


That's where our views differ. In my eyes, "d" and "c" are exactly identical, and I'd be very surprised if the SCM tried to tell me that they aren't by not giving them the same revid.
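
Why "d" could not share a revid with "c" can be sketched as follows. In git, a commit ID is a hash over the commit object, which names its tree and its parents, so two commits with the same tree but different parents get different IDs. The code below is a simplified model: the function `commit_id` is hypothetical, the tree names are placeholders rather than real object IDs, and the author and committer lines that real commit objects also contain are left out.

```python
import hashlib

def commit_id(tree, parents, message):
    """Hash a simplified commit record (tree, parents, message only)."""
    lines = ["tree " + tree] + ["parent " + p for p in parents] + ["", message]
    body = "\n".join(lines).encode()
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

a = commit_id("tree_1", [], "a")
b = commit_id("tree_2", [a], "b")
c = commit_id("tree_3", [b], "c")

# Same tree and same message as "c", but the parents are "a" and "c".
d = commit_id("tree_3", [a, c], "c")

print(c == d)                               # identical content, different IDs
print(c == commit_id("tree_3", [b], "c"))   # identical record, identical ID
```

Only a commit with the same tree, the same parents and the same metadata hashes to the same ID, so a fast-forward is the one way for master and test to agree on a single revid for this state.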

--
Andreas Ericsson                   andreas.ericsson@xxxxxx
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231