Re: Git hosting techniques

  Hi,

  cc'ing git@xxxxxxxxxxxxxxx since this might be interesting for other
Git people as well.

On Sun, Oct 29, 2006 at 06:54:46PM CET, Sylvain Beucler wrote:
> We're currently setting up something similar at
> http://cvs.sv.gnu.org/gitweb/,

  That's great!

> I would like to know if you considered the ability to autopack
> repositories to optimize space and disk i/o. For example, we're
> experimenting with the coreutils repository which weighs 1.1GB. Since
> you mirror the glibc repository, maybe you have similar issues?

  Currently I do it in a rather silly way: when I do an "all-repo
check" every hour (which updates mirrors of external repositories etc.),
I also check for unpacked objects and, if there are any, I repack
the repository; see

	http://repo.or.cz/w/repo.git?a=blob;f=updatecheck.sh;hb=HEAD
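  For readers who cannot reach the script, the hourly check described
above might look roughly like the following Python sketch. It is not a
port of updatecheck.sh; the function names and the `threshold` parameter
are made up for illustration, and it assumes the usual loose-object
layout of a SHA-1 repository (objects/<2 hex chars>/<38 hex chars>).

```python
import os
import re
import subprocess

# Loose objects live at objects/<2 hex chars>/<38 hex chars> (SHA-1 repositories).
_FANOUT = re.compile(r"^[0-9a-f]{2}$")
_REST = re.compile(r"^[0-9a-f]{38}$")


def count_loose_objects(git_dir):
    """Count loose (unpacked) objects under git_dir/objects."""
    objects_dir = os.path.join(git_dir, "objects")
    if not os.path.isdir(objects_dir):
        return 0
    total = 0
    for fanout in os.listdir(objects_dir):
        sub = os.path.join(objects_dir, fanout)
        # Skip objects/pack, objects/info and anything else that is not a fan-out dir.
        if _FANOUT.match(fanout) and os.path.isdir(sub):
            total += sum(1 for name in os.listdir(sub) if _REST.match(name))
    return total


def repack_if_needed(git_dir, threshold=1):
    """Run a full repack when at least `threshold` loose objects exist.

    `threshold` is a hypothetical knob; the behaviour described in the
    message corresponds to threshold=1 (repack if there are any).
    """
    if count_loose_objects(git_dir) < threshold:
        return False
    # -a: pack everything into a single pack, -d: remove the now-redundant packs.
    subprocess.run(["git", "--git-dir=" + git_dir, "repack", "-a", "-d"], check=True)
    return True
```

  Calling repack_if_needed() on each repository once an hour reproduces
the "silly" behaviour, including both drawbacks listed below.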

  This is not optimal behaviour, for two reasons:

  (i) A full repack can be a lot of work on large repositories, so we
shouldn't *always* repack; more importantly, we should only rarely do
a full repack - see below.

  (ii) This is very unfriendly to those who fetch over HTTP, because
after you do a full repack, they will need to download the whole new
packfile instead of just the missing objects.
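  To make point (ii) concrete, here is a deliberately simplified model
of what an HTTP fetcher has to download. It assumes the client can only
fetch whole packfiles and ignores loose objects; the pack names and
sizes are made up for illustration.

```python
def http_download_size(server_packs, client_packs):
    """Bytes an HTTP client must fetch: every server pack it lacks, in full.

    Both arguments map pack name -> size in bytes.
    Simplified model: loose objects are ignored.
    """
    return sum(size for name, size in server_packs.items()
               if name not in client_packs)


MB = 1024 * 1024
client = {"pack-old": 100 * MB}  # what the client fetched earlier

# Incremental repack: the old pack keeps its name, a small new pack appears.
incremental = {"pack-old": 100 * MB, "pack-new": 2 * MB}
# Full repack: everything is rewritten into one pack with a new name.
full = {"pack-all": 101 * MB}

print(http_download_size(incremental, client) // MB)
print(http_download_size(full, client) // MB)
```

  In this model the first call prints 2 and the second prints 101: the
client re-downloads history it already has, just because the pack was
rewritten under a new name.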

  The best solution would be to have a more intelligent repacking
strategy, where you have "archival" packs with very old history and an
active pack with just the new changes, and when you pack the loose
objects they just get appended to the "current" pack. Alternatively,
a slightly more complicated but even more flexible "logarithmic"
repacking strategy could be implemented, see

	http://news.gmane.org/find-root.php?message_id=<20051112135947.GC30496@xxxxxxxxxxx>

  Even with the dumb packing strategy though, I think it pays off if you
have at least a bit of CPU power to spare. The packing savings are
really immense. For example, with the glibc repository, an incremental
CVS import of a few days' worth of changes _doubled_ the size of the
repository (from 100M to 200M), while repacking brought it back to the
original size (100M) + epsilon.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
