Hi, On Sat, 12 Jul 2008, Stephan Hennig wrote: > Johannes Schindelin schrieb: > > On Fri, 11 Jul 2008, Andreas Ericsson wrote: > > > >> Seems like you're being bitten by a bug we had some months back, > >> where the client requested full history for new tag objects. > > > > I do not think so. I think it is a problem with the pack. The > > slowness is already there in the clone, in the resolving phase. > > Thanks for having a look at this! What does "problem with the pack" > mean? Do you think it is a Git problem (client or server side?) or just > a misconfiguration? I thought that the blobs in the pack are just too similar. That makes for a good compression, since you get many relatively small deltas. But it also makes for a lot of work to reconstruct the blobs. I suspected that you run out of space for the cache holding some reconstructed blobs (to prevent reconstructing all of them from scratch). To see what I mean, just look at $ git -p verify-pack -v \ .git/objects/pack/pack-563c2d83940c7e2d8c20a35206a390e2e567282f.pack (or whatever pack you have there). It has this: -- snip -- chain length = 40: 7 objects chain length = 41: 8 objects chain length = 42: 4 objects chain length = 43: 8 objects chain length = 44: 6 objects chain length = 45: 2 objects chain length = 46: 6 objects chain length = 47: 2 objects chain length = 48: 2 objects chain length = 49: 2 objects chain length = 50: 2 objects -- snap -- ... but that could not be the reason, as my current git.git's pack shows this: -- snip -- chain length = 40: 122 objects chain length = 41: 99 objects chain length = 42: 77 objects chain length = 43: 76 objects chain length = 44: 69 objects chain length = 45: 72 objects chain length = 46: 66 objects chain length = 47: 103 objects chain length = 48: 77 objects chain length = 49: 111 objects chain length = 50: 86 objects chain length > 50: 60 objects -- snap -- ... which is much worse. So I tried this: -- snip -- wortliste$ /usr/bin/time git index-pack -o /dev/null .git/objects/pack/pack-563c2d83940c7e2d8c20a35206a390e2e567282f.pack fatal: unable to create /dev/null: File exists Command exited with non-zero status 128 27.12user 11.21system 2:51.02elapsed 22%CPU (0avgtext+0avgdata 0maxresident)k 81848inputs+0outputs (1134major+2042348minor)pagefaults 0swaps -- snap -- Compare that to git.git: -- snip -- git$ /usr/bin/time git index-pack -o /dev/null .git/objects/pack/pack-355b54f45778b56c00099bf45369f8a4f2704a51.pack fatal: unable to create /dev/null: File exists Command exited with non-zero status 128 16.13user 0.38system 0:17.80elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k 81288inputs+0outputs (38major+51917minor)pagefaults 0swaps -- snap -- So it seems that the major faults (requiring I/O) occur substantially more often with your repository. BTW the RAM numbers here are obviously bogus, the program trashed the disk like there was no tomorrow. Okay, "valgrind --tool=massif" to the rescue: -- snip -- MB 555.9^ , # | @..# | @. :@::# | , @@: :@::# | ,@. @:. .:: @: : @@: :@::# | @: .@@::@:: :::: @: : @@: :@::# | , .@: :@@::@:: :::: @: : @@: :@::# | . .@ :@: :@@::@:: :::: @: : @@: :@::# | . :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# | . : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# | . ,.: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# : | .: @:: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# : | ::: @:: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# : | :::: @:: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# : | . ::::: @:: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# :. | .: ::::: @:: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# :: | ::: ::::: @:: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# :: | : ::: ::::: @:: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# :: | .:: ::: ::::: @:: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# :: | . :::: ::: ::::: @:: : : :: : :: :@ :@: :@@::@:: :::: @: : @@: :@::# :: 0----------------------------------------------------------------------->Gi 32.83 -- snap -- Whoa. As you can see, your puny little 3.3 megabyte pack is blown to a full 555 megabyte in RAM. That is bad. Okay, so what is the reason? You have a pretty large file there, "wortliste", weighing in with 13 megabyte. This file is part of at least one of those 50-strong delta chains. To reconstruct the blobs, we have to store all intermediate versions in RAM (since index-pack is called with "--stdin" from receive-pack, which is called by clone). Now, the file was big from the beginning, so you end up with ~13*50 megabyte (actually, even 100 megabyte less) while indexing one single delta chain. My tests were performed on a puny little laptop (512MB RAM, to be precise, as I am a strong believer that developers with too powerful machines just lose touch to reality and write programs that are only useful to themselves, but useless for everyone else), where this hurt big time. Now, I do not know the internals of index-pack enough to know if there is a way to cut the memory usage (by throwing out earlier reconstructed blobs, for example, and reconstructing them _again_ if need be), so I Cc:ed Nico and hand the problem off to him. I expect this to touch the resolve_delta() function of index-pack.c in a major way, though. Ciao, Dscho P.S.: It seems that "git verify-pack -v" only shows the sizes of the deltas. Might be interesting to some to show the unpacked _full_ size, too. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html