Re: erratic behavior commit --allow-empty

From: "Angelo Borsotti" <angelo.borsotti@xxxxxxxxx>
Sent: Thursday, October 04, 2012 8:07 AM
Hi Philip and all,

let me explain in full the problem that I was trying to solve, and
how along the way I stumbled on something that seems to me a git bug
(or at least a documentation one).

There is an R&D team developing software using a workflow that is
similar to the integration-manager one (the one described by Scott
Chacon in chapter 5 of Pro Git).

In that workflow the developers have a full copy/history of the integrator's relevant branches, so that when the developer's branch is pulled there is a proper link to the integrator's history.

Developers implement features using a local repository hosted on their
workstations, and when finished push to a server; integrators pull
from it and put all the contributions together.
Since integrators always rebuild the software after merging all
contributions, there is no need for the developers to push the
binaries. Not pushing them speeds up uploading.
In order to make life simpler and safer, scripts are provided to
perform the pushing, pulling, etc. operations. So, most of the git
commands shown below are actually run from within scripts.
The development of each feature is done in a dedicated topic branch,
and the commits done in it contain both the sources and the binaries
(so that a previous snapshot can be fully recovered when a later
change breaks something that used to work). When pushing, there are
these needs:

    1.  push the sources only
    2.  push only the last commit of the topic branch (not the whole history)

A note on point 2: the integrators are not interested in seeing all
the commits that developers made while implementing their features.
Having all that history would clutter their repositories.

In order to avoid pushing all the history, orphan branches are used to
parallel the topic ones.

There are other ways to create a branch which has all the developer's feature history removed, rather than using --orphan, which removes the integrator's history as well.

When pushing, first a commit is done on the topic branch, and then a
snapshot is created in the parallel branch with the same files,
binaries removed. The general case is:

    source branch                     D'
                                      :
    topic branch        A----B----C---D

In the picture, the developer made 4 commits, and pushed the sources
of the last one, D.
A D' is created on the source branch (the relationship with D is
indicated with a dotted line).

The disconnection of the D' source branch makes it sound like you have a second SCM system that you have to put stuff into, which is independent of the development team's git repos. I have this [hassle] at my $dayjob: one almost has to hide git from the powers-that-be.

The push script must cope with all the cases that may occur:

    1.  the general one (the one in the previous figure)
    2.  none of the commits in the topic branch contains binaries
        (i.e. D and D' have the same tree)
    3.  a push done immediately after the first commit (A)
    4.  a push done after another

The script:

    1.  creates the source branch if it does not exist yet (git
        checkout --orphan), otherwise makes HEAD point to it
    2.  sets a .git/info/exclude file that excludes the binaries
    3.  removes the binaries from the index (git rm)
    4.  creates a commit on the source branch
    5.  pushes it
    6.  restores the HEAD and index as they were before
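
The six steps above could be sketched roughly as follows. This is only an
illustration run in a throwaway repository, not the actual script: the branch
names (topic, source), the file names, the *.o pattern and the local bare
repository standing in for the server are all assumptions, and git rm --cached
is used here so that the binaries stay in the working tree.

```shell
#!/bin/sh
# Sketch of the six push-script steps in a throwaway repository.
# All names below (topic, source, main.c, main.o, server.git) are hypothetical.
set -e
work=$(mktemp -d)
git init -q --bare "$work/server.git"      # stands in for the real server
git init -q "$work/dev"
cd "$work/dev"
git config user.name "Dev"
git config user.email "dev@example.com"

# Set up a topic branch whose commit holds one source file and one binary.
git symbolic-ref HEAD refs/heads/topic
echo 'int main(void) { return 0; }' > main.c
printf 'fake object code' > main.o
git add main.c main.o
git commit -q -m "D: feature done"

topic=$(git symbolic-ref HEAD)             # remember where HEAD was

# 1. create the source branch, or just point HEAD at it if it already exists
if git show-ref --verify --quiet refs/heads/source; then
    git symbolic-ref HEAD refs/heads/source
else
    git checkout -q --orphan source
fi
# 2. exclude the binaries
mkdir -p .git/info
echo '*.o' >> .git/info/exclude
# 3. remove the binaries from the index only
git rm -q --cached --ignore-unmatch -- '*.o'
# 4. commit the source-only snapshot
git commit -q -m "D': sources of topic"
# 5. push it
git push -q "$work/server.git" source
# 6. restore HEAD and the index as they were before
git symbolic-ref HEAD "$topic"
git reset -q
```

Pointing HEAD at the branch with git symbolic-ref, instead of checking it
out, leaves the index and working tree as they were on the topic branch, so
the new source commit is built from the topic snapshot.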

The operation that caused problems was step 4. In all the cases
listed above, git commit creates a brand new and unique commit
because either its parent differs from that of any other commit,
or its tree does. All, that is, except case 3 when there are no
binaries:

    source branch       A'
                        :
    topic branch        A

In this case the parent is the same as that of A, i.e. none, and also
the tree is the same.
True.

In order to try to force the creation of a brand
new and unique commit even when the trees are the same, --allow-empty
has been used, but this did not help because

It was the combination: --orphan, --allow-empty (a common tree), a root commit, and a script that commits on both branches within the same clock tick...

git commit creates a
brand new one only if the system clock has ticked over to the next
second since the previous commit.
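
The time dependence follows from how a commit ID is computed: it is a hash
over the commit object, which records the tree, the parents, the author and
committer lines (with timestamps at one-second resolution) and the message.
A minimal sketch, assuming a throwaway repository, a made-up identity and
timestamps pinned through environment variables, using the plumbing command
git commit-tree:

```shell
#!/bin/sh
# Sketch: same tree + no parent + same identity + same second + same message
# gives the same commit ID. Identity and timestamp below are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="dev@example.com"
# Pin both timestamps (seconds since the epoch plus timezone offset).
export GIT_AUTHOR_DATE="1349338020 +0000"
export GIT_COMMITTER_DATE="1349338020 +0000"

echo 'int main(void) { return 0; }' > main.c
git add main.c
tree=$(git write-tree)

# Two parentless commits over the same tree, same message, same second.
a=$(git commit-tree -m "snapshot" "$tree")
a_prime=$(git commit-tree -m "snapshot" "$tree")

# Changing only the message yields a different commit object.
a_msg=$(git commit-tree -m "snapshot (sources only)" "$tree")

echo "A            = $a"
echo "A'           = $a_prime"
echo "A' (new msg) = $a_msg"
```

The first two IDs printed are identical, so A' is not a new commit at all;
the third differs, which is the message-changing workaround.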

Some of you have suggested creating an A' that is not an orphan in
such a case, which is one workaround; others suggested changing its
message, which is another. I chose the latter

A reasonable solution. You can also create a sentinel (root) commit whenever you need to create the source branch, just so that the real source code commit has a different parent on the source branch from the one it has on the binaries branch.
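
A rough sketch of the sentinel idea, again with git commit-tree in a throwaway
repository. The identity, the pinned timestamp and the messages are made up
for illustration; the sentinel here is an empty-tree root commit, which is one
possible choice and not necessarily the exact form meant above.

```shell
#!/bin/sh
# Sketch: a sentinel root commit gives the source snapshot a parent,
# so it cannot collide with the parentless commit A on the topic branch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="dev@example.com"
export GIT_AUTHOR_DATE="1349338020 +0000"
export GIT_COMMITTER_DATE="1349338020 +0000"

echo 'int main(void) { return 0; }' > main.c
git add main.c
tree=$(git write-tree)
empty_tree=$(git mktree </dev/null)        # tree object with no entries

# A: the root commit on the topic branch.
a=$(git commit-tree -m "snapshot" "$tree")

# Sentinel root for the source branch, then the real snapshot on top of it.
sentinel=$(git commit-tree -m "source branch root" "$empty_tree")
a_prime=$(git commit-tree -p "$sentinel" -m "snapshot" "$tree")

echo "A  = $a"
echo "A' = $a_prime"
```

Even with an identical tree, message and timestamp, A' differs from A because
its parent is the sentinel rather than nothing.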

However, personally, I'd have wanted the source branch to show real history and actually match the integrator's repo history, but no doubt local conditions & politics have their influence.

because it allows the
source branch to stay orphan in all cases. So, there are workarounds,
and the script has eventually been implemented and tested, but the
unexpected, time-dependent behavior of git commit is there and someone
could stumble on it sooner or later.

-Angelo
