Re: large(25G) repository in git

On Mon, Mar 23, 2009, Adam Heath wrote:
> We maintain a website in git.  This website has a bunch of backend
> server code, and a bunch of data files.  A lot of these files are full
> videos.
> 
> [...]
> 
> Last Friday, I was doing a checkin on the production server, and found
> 1.6G of new files.  git was quite able at committing that.  However,
> pushing was problematic.  I was pushing over ssh; so, a new ssh
> connection was open to the preview server.  After doing so, git tried
> to create a new pack file.  This took *ages*, and the ssh connection
> died.  So did git, when it finally got done with the new pack, and
> discovered the ssh connection was gone.

   As stated several times by Linus and others, Git was not designed
to handle large files. My stance is that, before trying to optimise
operations so that they also perform well on large files, Git should
usually avoid such operations altogether, especially deltification.
One notable exception would be someone storing their mailbox in Git,
where deltification is a major space saver. But usually, these large
files are binary blobs that do not benefit from delta search (or even
compression).
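
   As an illustration of skipping deltification for such files on a
stock Git, here is a minimal sketch. It assumes a Git version that
honours the `delta` attribute in gitattributes; the scratch repository,
the `*.avi` / `*.mp4` patterns and the sample path are hypothetical and
do not come from this thread.

```shell
# Hypothetical scratch repository; in practice, append to the
# .gitattributes file of the real work tree instead.
repo=$(mktemp -d)
git init -q "$repo"

# Mark video files (example patterns only) so that packing does not
# attempt delta compression on them.
printf '%s\n' '*.avi -delta' '*.mp4 -delta' >> "$repo/.gitattributes"

# Ask Git how it resolves the attribute for a sample path.
git -C "$repo" check-attr delta -- videos/intro.mp4
```

The last command only reports how the attribute resolves; it does not
repack anything, so it is a cheap way to check the patterns first.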

   Since I also need to handle large files (80 GiB repository), I am
cleaning up some fixes I made, which can be seen in the git-bigfiles
project (http://caca.zoy.org/wiki/git-bigfiles). I have not yet tried
to change git-push (because I submit through git-p4), but I hope to
address it, too. In time, I believe some of these fixes could make it
into mainline Git.

   In your particular case, I would suggest setting pack.packSizeLimit
to something lower. This would reduce the time spent generating a new
pack file if the problem were to happen again.
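
   A minimal sketch of that setting, assuming a current Git where the
value accepts k/m/g suffixes. The 512m figure and the scratch
repository are arbitrary illustrations, not values from this thread.
Git's documentation describes this limit as applying to packs written
to a file when repacking, so it is worth checking whether it helps a
given push.

```shell
# Hypothetical scratch repository; in practice, run git config inside
# the real repository.
repo=$(mktemp -d)
git init -q "$repo"

# Cap the size of each pack file at 512 MiB (example value only).
git -C "$repo" config pack.packSizeLimit 512m

# Read the setting back to confirm it was stored.
git -C "$repo" config --get pack.packSizeLimit
```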

Regards,
-- 
Sam.