Re: 16 gig, 350,000 file repository

On Thursday, February 18, 2010 at 15:58:42 (-0500) Nicolas Pitre writes:
>On Thu, 18 Feb 2010, Bill Lear wrote:
>
>> I'm starting a new, large project and would like a quick bit of advice.
>> 
>> Bringing in a set of test cases and other files from a ClearCase
>> repository resulted in a 350,000 file git repo of about 16 gigabytes.
>> 
>> The time to clone over a fast network was about 250 minutes.  I could
>> not verify if the repo had been packed properly, etc.
>
>I'd start from there.  If you didn't do a 'git gc --aggressive' after 
>the import then it is quite likely that your repo isn't well packed.
>
>Of course you'll need a big machine to repack this.  But that should be 
>needed only once.

Ok, well they have a "big machine", but not big enough.  It's running
out of memory on the gc.  I believe they have a fair amount of memory:

% free
             total       used       free     shared    buffers     cached
Mem:      16629680   16051444     578236          0      28332   14385948
-/+ buffers/cache:    1637164   14992516
Swap:      8289500       1704    8287796

and they are using git 1.6.6.
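
For reference, a rough and untested sketch of the pack settings that look
relevant to capping memory during a repack.  The function name
limit_pack_memory and all of the values are assumptions, not something
measured on this repository:

```shell
#!/bin/sh
# Sketch only: the values are guesses and would need tuning on the real repo.
# limit_pack_memory is a made-up helper name.
# Usage: limit_pack_memory <path-to-repo>
limit_pack_memory() {
    repo=$1
    (
        cd "$repo" || exit 1
        # Cap the memory each pack-objects thread may use for its delta window.
        git config pack.windowMemory 256m
        # Each thread keeps its own window, so use a single thread.
        git config pack.threads 1
        # Cap the cache of computed deltas held before they are written out.
        git config pack.deltaCacheSize 128m
    )
}

# With the limits in place, a full repack that recomputes deltas would be:
#   (cd <path-to-repo> && git repack -a -d -f --window=50 --depth=50)
# The window and depth of 50 are arbitrary; smaller values trade pack size
# for less time and memory.
```

These limits trade packing speed (and possibly pack size) for a smaller
memory footprint, so the repack would likely take longer.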

Assuming we can figure out how to gc this puppy (is there any way on a
machine without 64 gigabytes?), there is still a question that
remains: how to organize a project that has a very large number of
test cases (and test data) that we might not want to pull across the
wire each time.  Instead of a shallow clone, is there a sort of slicing
clone operation?

We thought of using submodules.  That is, code (say) goes in a separate
repo 'src' and functional tests go in another, called 'ftests'.  Then,
we add 'ftests' as a submodule to 'src'.  Great.  However, we need to
be able to branch 'src' and 'ftests' together.  Example: I am working on
a new feature in a branch "GLX-473_incremental_compression".  I would like
to be able to create the branch in both the 'src' repo and the 'ftests'
repo at the same time, make changes, commit, and push to that branch for
both.  When developers check out the repo, they move to that branch, but
do NOT want the cloned ftests.  However, the QA team wants both the source
and the tests that I have checked in and pushed.
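
To make the workflow above concrete, here is a minimal sketch of what it
might look like with submodules.  It assumes 'ftests' is already registered
as a submodule of 'src' and checked out at src/ftests, and that both repos
have a remote named 'origin'.  The helper names branch_both and push_both
are made up for illustration:

```shell
#!/bin/sh
# Sketch only: assumes 'ftests' is a submodule checked out at <src>/ftests.
# branch_both and push_both are hypothetical helper names.

# Create the same branch in the submodule and in the superproject.
# Usage: branch_both <path-to-src> <branch>
branch_both() {
    src=$1
    branch=$2
    (cd "$src/ftests" && git checkout -b "$branch") &&
    (cd "$src" && git checkout -b "$branch")
}

# Push the branch from both repos, submodule first, so that the commit
# 'src' records for 'ftests' already exists on the server.
# Assumes both repos have a remote named 'origin'.
# Usage: push_both <path-to-src> <branch>
push_both() {
    src=$1
    branch=$2
    (cd "$src/ftests" && git push origin "$branch") &&
    (cd "$src" && git push origin "$branch")
}

# Developers: clone 'src' and check out the branch.  Without running
# 'git submodule update --init', the ftests directory stays empty.
# QA: after checking out the branch in 'src', run
#   git submodule update --init
# to fetch 'ftests' at the commit recorded by that branch of 'src'.
```

Between the two calls, changes would be committed in 'ftests' first and then
recorded in 'src' with 'git add ftests' followed by a commit.  Note that
'git submodule update' checks out the recorded commit rather than the branch
itself, so the two branches are kept in step only by convention.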

Is there an easy way to support this?


Bill