Re: Performance issue: initial git clone causes massive repack

On Sat, Apr 04, 2009 at 05:37:53PM -0700, Robin H. Johnson wrote:

> That causes incredible bloat unfortunately.
> 
> I'll summarize why here for the git mailing list. Most of our developers
> have the entire tree checked out, and in informal surveys, would like to
> continue to do so. There are ~13500 packages right now 

Surely no single developer works on that many packages? From my point
of view, checking out the entire tree is the wrong way to do things.

Also, you could still keep a whole-tree repo, as long as it's _not_
"fetch-able".

> For each package, the .git directory, assuming in a single pack,
> consumes at least 36 inodes.  Tail-packing is limited to Reiserfs3 and
> JFS, and isn't widely used other than that, so assuming 4KiB inodes,
> that's an overhead of at least 144KiB per package. Multiply by the
> number of packages, and we get an overhead of 2GiB, before we've added
> ANY content.

> Without tail packing, the Gentoo tree is presently around 520MiB (you
> can fit it into ~190MiB with tail packing). This means that
> repo-per-package would have an overhead in the range of 400%.
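
For reference, the arithmetic quoted above can be checked with a short
sketch. It uses only the figures from the quoted message (36 inodes per
repository, 4KiB per inode, ~13500 packages, a 520MiB tree). The constant
and function names are made up for illustration, and it assumes every
inode costs one full 4KiB block, as the quoted estimate does.

```python
# Figures taken from the quoted message; names are illustrative only.
INODES_PER_REPO = 36   # minimum inodes for a single-pack .git directory
KIB_PER_INODE = 4      # assumption: each inode occupies one 4KiB block
PACKAGES = 13500       # approximate number of packages in the tree
TREE_MIB = 520         # size of the tree without tail packing


def per_repo_overhead_kib():
    """Fixed .git overhead for one package repository, in KiB."""
    return INODES_PER_REPO * KIB_PER_INODE


def total_overhead_mib():
    """Fixed .git overhead summed over all packages, in MiB."""
    return per_repo_overhead_kib() * PACKAGES / 1024


def overhead_percent():
    """Total overhead relative to the size of the tree, in percent."""
    return total_overhead_mib() / TREE_MIB * 100


if __name__ == "__main__":
    print(f"per repo: {per_repo_overhead_kib()} KiB")
    print(f"total:    {total_overhead_mib() / 1024:.2f} GiB")
    print(f"overhead: {overhead_percent():.0f}% of the tree")
```

With these inputs the total comes to roughly 1.85GiB, about 365% of the
520MiB tree, which is consistent with the rounded "2GiB" and "in the
range of 400%" figures.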

I don't know Gentoo's constraints, but disk space is cheap. Also, I'd
like to know how much space you will gain from the CVS to Git migration.
How much bigger is a CVS repo than a Git one?

One repo per category could be a good compromise, assuming one separate
branch per ebuild, then.

> Additionally, there's a lot of commonality between ebuilds and packages,
> and having repo-per-package means that the compression algorithms can't
> make use of it - dictionary algorithms are effective at compression for
> a reason.

Please, no. We are talking about long-term issues here. Compression will
still be efficient: it is all about the content of the files, and
dictionary algorithms will certainly do a good job across the revisions
of each ebuild.

-- 
Nicolas Sebrecht