Re: git-index-pack really does suck..

On Tue, 3 Apr 2007, Chris Lee wrote:
>
> git-index-pack --paranoid --stdin --fix-thin paranoid.pack < 
> 5.28s user 0.24s system 98% cpu 5.592 total
> 
> git-index-pack --stdin --fix-thin trusting.pack < 
> 5.07s user 0.12s system 99% cpu 5.202 total

Ok, that's not a big enough difference to care about.
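
[ For the record, the arithmetic behind that is trivial. A minimal sketch in
  Python, using only the timings quoted above; the helper name
  `relative_slowdown` is made up for illustration and is not part of git: ]

```python
def relative_slowdown(slow: float, fast: float) -> float:
    """Return the fractional extra time of `slow` relative to `fast`."""
    return (slow - fast) / fast


# Timings in seconds, copied from the two quoted runs above.
paranoid = {"user": 5.28, "total": 5.592}
trusting = {"user": 5.07, "total": 5.202}

for key in ("user", "total"):
    pct = 100 * relative_slowdown(paranoid[key], trusting[key])
    print(f"{key}: {pct:.1f}% slower with --paranoid")
```

That works out to a single-digit percentage on both user time and wall-clock
time, from one run each, so it could just as well be noise.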

> So, in my case, at least... not really much of a difference, which is
> puzzling.

It's entirely possible that the object lookup is good enough to not be a 
problem even for huge packs, and it really only gets to be a problem when 
you actually unpack all the objects.

In that case, the only real case to worry about is indeed the "alternates" 
case (or if people actually use a shared git object directory, but I don't 
think anybody really does - alternates just work well enough, and shared 
object directories are painful enough that I doubt anybody *really* uses 
it).
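
[ To make the "alternates" case concrete: an object lookup has to consider
  the repository's own object directory plus every directory listed in
  objects/info/alternates. A rough sketch in Python of how that list could be
  collected; `object_dirs` is a hypothetical helper, not git's actual code,
  and it assumes one path per line, '#' comment lines, and relative paths
  resolved against the objects directory. It does not follow alternates of
  alternates: ]

```python
from pathlib import Path


def object_dirs(git_dir: str) -> list[Path]:
    """List the object directories a lookup would have to consider.

    Illustrative only: starts with <git_dir>/objects and appends each
    entry from objects/info/alternates, if that file exists.
    """
    objects = Path(git_dir) / "objects"
    dirs = [objects]
    alt_file = objects / "info" / "alternates"
    if alt_file.is_file():
        for line in alt_file.read_text().splitlines():
            line = line.strip()
            # Skip blank lines and (assumed) '#' comment lines.
            if not line or line.startswith("#"):
                continue
            path = Path(line)
            # Assumption: relative entries are relative to the objects dir.
            dirs.append(path if path.is_absolute() else objects / path)
    return dirs
```

Every extra entry in that list is another place an "is this object already
here?" check may have to look, which is why alternates are the case to think
about.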

> I also mailed out the DVD with the repo on it to hpa today, so
> hopefully by tomorrow he'll get it. (He's not even two cities over,
> and I suspect I could have just driven it to his place, but that might
> have been a little awkward since I've never met him.)

Heh. Ok, good. I'll torrent it or something when it's up.

> Anyway, so, hopefully once he gets it he can put it up somewhere that
> you guys can grab it. For reference, the KDE repo is pretty big, but a
> "real" conversion of the repo would be bigger; the one that I've been
> playing with only has the KDE svn trunk, and only the first 409k
> revisions - there are, as of right now, over 650k revisions in KDE's
> svn repo. So, realistically speaking, a fully-converted KDE git repo
> would probably take up at least 6GB, packed, if not more. Subproject
> support would probably be *really* helpful to mitigate that.

Sure. I think subproject support is likely the big "missing feature" of 
git right now. The rest is "details", even if they can be big and involved 
details.

But even at only 409k revisions, it's still going to be an order of 
magnitude bigger than what the kernel is, exactly *because* it's such a 
disaster from a maintenance setup standpoint, and it's going to be a 
useful real-world test-case. So whether that is a "good" git archive or 
not, it's going to be useful.

Long ago we used to be able to look at the historic Linux archive as an 
example of a "big" archive, but it's not actually all that much bigger 
than the normal Linux archive any more, and we've pretty much fixed the 
problems we used to have.

[ The historical pack-file is actually smaller, but that's because it was 
  done with a much deeper delta-chain to make it small: the historical 
  archive still has more objects in it than the current active git kernel 
  tree - but it's only in the 20% range, not "20 *times* bigger" ]

The Eclipse tree was useful (and I think we already improved performance 
for you thanks to working with it - I don't know how much faster the 
delta-base cache made things for you, but I'd assume it was at *least* by 
the factor-of-2.5 that we saw on Eclipse), but the KDE tree is bigger *and* 
deeper (the Eclipse tree is 1.7GB, and 136k revisions in the main branch, 
so the KDE tree is more than twice the revisions).

		Linus