Re: How to deal with historic tar-balls

On 01.01.2012 01:27, Tomas Carnecky wrote:
> On 12/31/11 8:04 PM, nn6eumtr wrote:
>> I have a number of older projects that I want to bring into a git
>> repository. They predate a lot of the popular scm systems, so they are
>> primarily a collection of tarballs today.
>>
>> I'm fairly new to git so I have a couple questions related to this:
>>
>> - What is the best approach for bringing them in? Do I just create a
>> repository, then unpack the files, commit them, clean out the
>> directory unpack the next tarball, and repeat until everything is loaded?
>>
>> - Do I need to pay special attention to files that are renamed/removed
>> from version to version?
>>
>> - If the timestamps change on a file but the actual content does not,
>> will git treat it as a non-change once it realizes the content hasn't
>> changed?
>>
>> - Last, if after loading the repository I find another version of the
>> files that predates those I've loaded, or are intermediate between two
>> commits I've already loaded, is there a way to go say that commit B is
>> actually the ancestor of commit C? (i.e. a->c becomes a->b->c if you
>> were to visualize the commit timeline or do diffs) Or do I just reload
>> the tarballs in order to achieve this?
> 
> There is a script which will import sources from multiple tarballs,
> creating a commit with the contents of each tarball. It's in the git
> repository under contrib/fast-import/import-tars.perl.
> 
> tom

@tom: True. I didn't know about that script, but it should work.

@nn6eumtr: Basically your workflow is perfect. But let me give you some
explanation:

git init
for archive in *.tar; do
    tar xf "$archive"
    git add --all .
    git commit -m "Added $archive"
    # Now remove everything except for the .git directory
    # with regular shell commands (rm -rf *). Also remove
    # any dot-files (and the tarball itself, if it's in the
    # current directory).
done
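
Here is a minimal sketch of how that loop could be fleshed out so that it
runs as-is. It makes a few assumptions that are mine, not part of the
workflow above: the tarballs are kept outside the repository (so they are
neither committed nor deleted), git is new enough to know "git -C", and
find supports -mindepth/-maxdepth. The function name import_tarballs is
purely illustrative.

```shell
#!/bin/sh
# Sketch: import the given tarballs into a repository, one commit each.
# Assumption: the tarballs live OUTSIDE the repository directory.
# import_tarballs is a hypothetical helper name, not a git command.
import_tarballs() {
    repo=$1
    shift
    mkdir -p "$repo"
    git -C "$repo" init -q
    for archive in "$@"; do
        # Empty the working tree (dot-files included) but keep .git.
        find "$repo" -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
        # Unpack the next snapshot into the now empty working tree.
        tar xf "$archive" -C "$repo"
        # Stage additions, modifications and removals in one go.
        git -C "$repo" add --all .
        git -C "$repo" commit -q -m "Added $(basename "$archive")"
    done
}
```

It would be called as, for example, import_tarballs myrepo
/path/to/old-1.tar /path/to/old-2.tar, with the archives given in the
order they should appear in the history. If each tarball unpacks into its
own versioned top-level directory, adding --strip-components=1 to the tar
call (GNU tar and bsdtar both have it) should keep the paths stable from
commit to commit.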

Notice the '--all' switch to 'git add': Normally, 'git add .' adds all
files that match the given pattern '.', i.e. all files in the current
directory (and below, it's recursive). The '--all' switch together with
the pattern '.' adds or updates all files already known to git *AND*
adds the files not yet known *AND* removes the files that are no longer
in the working tree. That's exactly what you want.

Consider archive1.tar with files A, B, C:

  git add --all . # will add A, B, and C

Now remove A, B, C, and unpack archive2.tar. Assume it has files B, C,
D. A was deleted, B was changed, C is unchanged, D is new.

  git add --all . # will remove A, add B, leave C, add D.

git will notice that C hasn't changed its content (timestamp doesn't
matter).

Without the '--all' switch, git would simply add B and D; the removal of
A would not be staged.
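
The A/B/C example above can be reproduced in a scratch repository. This
is only a sketch: the file contents are made up, the function name
demo_add_all is hypothetical, and the identity passed with -c is a
placeholder so that no global git configuration is needed.

```shell
#!/bin/sh
# Sketch: replay the archive1 (A, B, C) -> archive2 (B, C, D) example.
# The function body runs in a subshell, so the caller's directory is kept.
demo_add_all() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q

    # "archive1": files A, B and C.
    printf 'a\n' > A; printf 'b\n' > B; printf 'c\n' > C
    git add --all .
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "Added archive1"

    # "archive2": A is gone, B changed, C rewritten with the same
    # content (so only its timestamp differs), D is new.
    rm A B C
    printf 'b changed\n' > B; printf 'c\n' > C; printf 'd\n' > D
    git add --all .

    # Show what is staged for the next commit, one path per line.
    git diff --cached --name-status --no-renames
)
```

Calling demo_add_all should list A as deleted (D), B as modified (M) and
D as added (A). C does not show up at all, because only its timestamp
changed.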

There is no problem re-arranging the history after your import (see "git
rebase --help", especially the --interactive section), but then you
will probably run into conflicts and have to resolve them. I'd suggest
restarting the import instead.

Please note that "for archive in *.tar" will pick the tarballs in
lexicographical order. That might not be your intention: a name like
foo-1.10.tar sorts before foo-1.9.tar.
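
One possible way around the ordering problem is to sort the names by
version before looping over them. The sketch below assumes GNU sort (the
-V flag is not in POSIX) and file names without embedded newlines; the
function name list_in_version_order is just for illustration.

```shell
#!/bin/sh
# Sketch: print the given file names in version order, one per line.
# Assumes GNU sort; -V compares embedded numbers numerically.
list_in_version_order() {
    printf '%s\n' "$@" | sort -V
}
```

The result can then drive the import, e.g.
list_in_version_order *.tar | while IFS= read -r archive; do ...; done.
If the names carry no usable version number at all, listing the archives
by hand in the right order is probably the safer choice.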

HTH,
    Dirk