Re: How to deal with historic tar-balls

On 12/31/2011 1:04 PM, nn6eumtr wrote:
> I have a number of older projects that I want to bring into a git
> repository. They predate a lot of the popular scm systems, so they
> are primarily a collection of tarballs today.
>
> I'm fairly new to git so I have a couple questions related to this:
>
> - What is the best approach for bringing them in? Do I just create a
>   repository, then unpack the files, commit them, clean out the
>   directory unpack the next tarball, and repeat until everything is
>   loaded?
>
> - Do I need to pay special attention to files that are
>   renamed/removed from version to version?
>
> - If the timestamps change on a file but the actual content does not,
>   will git treat it as a non-change once it realizes the content
>   hasn't changed?
>
> - Last, if after loading the repository I find another version of the
>   files that predates those I've loaded, or are intermediate between
>   two commits I've already loaded, is there a way to go say that commit
>   B is actually the ancestor of commit C? (i.e. a->c becomes a->b->c if
>   you were to visualize the commit timeline or do diffs) Or do I just
>   reload the tarballs in order to achieve this?

The git-rm manpage explains how to do this in its discussion of vendor code drops. I suggest importing each tarball by hand rather than queueing them up in a script, because you will probably want to clean up the working tree in each iteration before committing. That is the point at which to review renames and removals with git-status, before you git-add and git-commit. Also, if you are tracking permissions in git (the executable bit), you will want to filter out the noise caused by spurious permission changes between tarballs.
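For illustration, here is a minimal sketch of that unpack/review/commit cycle. The project name "proj", the version numbers, the file names and the commit messages are all made up, and the tarballs are generated on the spot so the sketch is self-contained. The loop is only there to keep the example short; with real tarballs you would stop at the git-status step each time and clean up by hand. It assumes a tar that understands --strip-components (GNU tar and bsdtar do).

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)                 # scratch directory for the whole demo
cd "$work"

# Build two small stand-in tarballs for a hypothetical project "proj".
mkdir proj-1.0 proj-1.1
echo "hello"  > proj-1.0/README
echo "int a;" > proj-1.0/old.c
echo "hello"  > proj-1.1/README
echo "int a;" > proj-1.1/new.c    # old.c was renamed to new.c in 1.1
tar -czf proj-1.0.tar.gz proj-1.0
tar -czf proj-1.1.tar.gz proj-1.1

git init -q repo
cd repo
git config user.name  "Importer"
git config user.email "importer@example.invalid"

for version in 1.0 1.1; do
    # Empty the working tree (everything except .git) so that files missing
    # from the next tarball end up recorded as removals.
    find . -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
    # Unpack, dropping the top-level "proj-<version>/" directory.
    tar -xzf "../proj-$version.tar.gz" --strip-components=1
    git add -A                    # stage additions, modifications and removals
    git status --short            # review point: look at renames and removals
    git commit -q -m "Import proj $version"
done

git log --oneline
```

Emptying the working tree before each unpack is what lets "git add -A" notice files that disappeared between versions; unpacking on top of the previous contents would silently keep them.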

Whether you can insert tarballs into the history later depends on when you plan to do it. You can only do it safely before the history is published (made "public" for other repos to pull from). After that you would be rewriting published history, which is a big no-no (see the git-rebase manpage). I suggest you do your homework and put the tarballs in the right order before you start, because that will be less work. If you still find that you missed one, you can use interactive git-rebase to insert it. I'm assuming that a single "master" branch with linear history is your desired end result. If you want maintenance branches that show the release history, then you will definitely need to do your homework first (see the gitworkflows manpage).

If you venture into rebase territory and rewrite history (inserting missed tarballs between older commits), be sure to review the automatic merge resolutions. Git only reports a merge conflict when both sides change the same lines of the same file. Changes to different lines of the same file are merged automatically, and nothing guarantees that the combined result is what you want.
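As a rough sketch of the interactive-rebase route, the script below inserts a missed version between two existing commits and then checks the result. Everything in it is hypothetical: the file names, the versions, the "before-rewrite" tag, and the small mark-edit.sh helper that stands in for editing the todo list by hand (it changes the first "pick" to "edit"). The check at the end rests on an assumption of mine, namely that the newest tarball's contents should be unchanged by the rewrite, so the final tree ought to be identical to the old tip. The example is built so that the versions touch different files; with real tarballs, expect conflicts at the continue step.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name  "Importer"
git config user.email "importer@example.invalid"

# Stand-ins for unpacked tarballs: 1.0, then 1.2 (1.1 was missed).
printf 'alpha\n' > main.txt
git add -A && git commit -q -m "Import 1.0"
printf 'alpha\nbeta\n' > main.txt
printf 'notes\n' > notes.txt      # notes.txt first appeared in the missed 1.1
git add -A && git commit -q -m "Import 1.2"
git tag before-rewrite            # remember the old tip for the final check

# Hypothetical helper used as the sequence editor: it turns the first
# "pick" line of the rebase todo list into "edit".
cat > "$work/mark-edit.sh" <<'EOF'
#!/bin/sh
awk '!done && /^pick / { sub(/^pick/, "edit"); done = 1 } { print }' "$1" > "$1.new"
mv "$1.new" "$1"
EOF
chmod +x "$work/mark-edit.sh"
GIT_SEQUENCE_EDITOR="$work/mark-edit.sh" git rebase -q -i --root

# The rebase is now paused right after "Import 1.0": commit the missed version.
printf 'notes\n' > notes.txt
git add -A && git commit -q -m "Import 1.1"

# Replay "Import 1.2" on top. GIT_EDITOR=true is only a precaution so that
# nothing waits for an editor in a non-interactive run.
GIT_EDITOR=true git rebase --continue

git log --oneline
# Review step: the rewritten tip should have the same tree as the old tip.
git diff --quiet before-rewrite HEAD && echo "final tree unchanged"
```

If the last command reports a difference, an automatic resolution changed the newest snapshot and needs a closer look.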

You should also ask yourself whether you really need a history of all those versions. To exaggerate: if all you really need is the current state, is it worth the effort to record the previous states? Maybe what you want is something in between (a happy medium).

As for the 'start-over' method of inserting a missed tarball: git-reset --hard to the commit you want to insert on top of, add the missed tarball, and then re-apply the subsequent tarballs. If you did cleanup between commits, it will likely be easier to rebase or cherry-pick the already cleaned-up subsequent commits from the previous attempt onto the do-over branch than to redo them. To keep the previous attempt around, run 'git branch old-branch' on your branch before the git-reset --hard. That gives you a backup copy called "old-branch" from which you can salvage work already done, using rebase or cherry-pick.
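A small sketch of the do-over, under the same caveats as any toy example: the file names, versions and commit messages are invented, and the contents are chosen so that the cherry-pick applies cleanly, which real tarballs may not do.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name  "Importer"
git config user.email "importer@example.invalid"

# First attempt: 1.0 and 1.2 imported, 1.1 missed.
printf 'alpha\n' > main.txt
git add -A && git commit -q -m "Import 1.0"
insert_point=$(git rev-parse HEAD)   # the commit to insert on top of
printf 'alpha\nbeta\n' > main.txt
printf 'notes\n' > notes.txt
git add -A && git commit -q -m "Import 1.2"

git branch old-branch                # backup copy of the previous attempt
git reset -q --hard "$insert_point"  # rewind the current branch

# Add the missed version (stand-in for unpacking the 1.1 tarball).
printf 'notes\n' > notes.txt
git add -A && git commit -q -m "Import 1.1"

# Re-apply everything that came after the insert point from the backup.
git cherry-pick "$insert_point..old-branch"

git log --oneline
git diff --stat old-branch HEAD      # empty output: same final tree as before
```

Once you are happy with the new history, "git branch -D old-branch" removes the backup.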

Hope this helps.

v/r,
neal