Re: How to deal with historic tar-balls

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: "Tomas Carnecky" <tom@xxxxxxxxxxxxx> Sent: Sunday, January 01, 2012
12:27 AM
>On 12/31/11 8:04 PM, nn6eumtr wrote:
>> I have a number of older projects that I want to bring into a git
>> repository. They predate a lot of the popular scm systems, so they are
>> primarily a collection of tarballs today.
I'm doing a similar thing with a set of zip files. I grouped mine into
batches for easier checking and putting on to separate branches. Planning
your branch requirements is probably the biggest task, and will depend on
how you hope to use the new repo.

>> I'm fairly new to git so I have a couple questions related to this:
>>
>> - What is the best approach for bringing them in? Do I just create a
>> repository, then unpack the files, commit them, clean out the
>> directory unpack the next tarball, and repeat until everything is loaded?
Essentially yes; Obviously if you have an organisation in mind then you can
introduce maintenance branches etc as you develop the import.

If it is simply to create a nice history that isn't really looked at, then a
simple linear model is OK. If you need to keep old mintenance versions and
obtain diffs with newer versions then look at the various branching models
and populate appropriately.

>>
>> - Do I need to pay special attention to files that are renamed/removed
>> from version to version?
No; It is only if you use those file dates to determine the implied date for
the commit.

>>
>> - If the timestamps change on a file but the actual content does not,
>> will git treat it as a non-change once it realizes the content hasn't
>> changed?
Correct. In fact git doesn't record the time stamps anyway. It simply
records the content, and structure, of the snapshot.

>>
>> - Last, if after loading the repository I find another version of the
>> files that predates those I've loaded, or are intermediate between two
>> commits I've already loaded, is there a way to go say that commit B is
>> actually the ancestor of commit C? (i.e. a->>c becomes a->>b->>c if you
>> were to visualize the commit timeline or do diffs) Or do I just reload
>> the tarballs in order to achieve this?
You can use 'grafts' as a mechanism to re-arrange the commit order, and/or
join partial repos, and then use git filter-branch to re-write the lot as a
single cohesive repo. But this 'hack' does re-write all the commit SHA1
values, so you should minimise the number of times that happens...

It is worth capturing your import sequence as a script so that you can wash 
/ rinse / repeat as often as needed to get a result you like.

> There is a script which will import sources from multiple tarballs,
> creating a commit with the contents of each tarball. It's in the git
> repository under contrib/fast-import/import-tars.perl.
I wasn't aware of those scripts. I'll be having a look at the zip import
script for my needs.

My extra problem is that almost all my zips have an extra top level
directory that changes its name for every zip (but some don't..). The TLD
changes confuses the git rename detection if I don't remove them before
committing. Fortunately it's an internal development project with no formal
releases so creating the history is a bit of a personal project which
doesn't affect ongoing development (which is the crunch question for
fidelity of the repo you create).

> tom
Philip

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]