Again, I'm about to leave on a trip for a few days (back late Thursday), but
just wanted to comment on the thread.

On Mon, Apr 06, 2009 at 12:06:00AM -0400, Nicolas Pitre wrote:
> > While my current pack setup has multiple packs of not more than 100MiB
> > each, that was simply for ease of resume with rsync+http tests. Even
> > when I already had a single pack, with every object reachable,
> > pack-objects was redoing the packing.
> In that case it shouldn't have.
I'll retest that part on my return, but I'm pretty sure I did see the same
excess CPU time usage.

> > Also, I did another trace, using some other hardware, in a LAN setting,
> > and noticed that git-upload-pack/pack-objects only seems to start output
> > to the network after it reaches 100% in 'remote: Compressing objects:'.
> That's to be expected. Delta compression matches objects which are not
> in the stream order at all. Therefore it is not possible to start
> outputting pack data until this pass is done. Still, this pass should
> not be invoked if your repository is already fully packed into one pack.
So it's seeking around the existing packs before sending?

> Can you confirm this is actually the case?
The most recent tests were with the 15 (+ one partial) packs limited to a
max of 100MiB each, because that made resume for rsync/http during the
tests much cleaner.

> > Relatedly, throwing more RAM (6GiB total, vs. the previous 2GiB) at
> > the server in this case cut the 200 wallclock minutes before any
> > sending took place down to 5 minutes.
> Well... here's a wild guess. In the source repository serving clone
> requests, please do:
> 	git config pack.deltaCacheSize 1
> 	git config pack.deltaCacheLimit 0
> and try cloning again with a fully packed repository.
I did the multiple-pack case quickly, and found that it does still take a
long time in the low-memory case. I'll do the test with a single pack on
my return.

> The caching pack project is to address a different issue: mainly to
> bypass the object enumeration cost. In other words, it could allow for
> skipping the "Counting objects" pass, and a tiny bit more. At least in
> theory that's about the main difference. This has many drawbacks as
> well though.
Relatedly, would it be possible to keep a cache of enumerated objects that
was trivially updatable during pushes?

-- 
Robin Hugh Johnson
Gentoo Linux Developer & Infra Guy
E-Mail     : robbat2@xxxxxxxxxx
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
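
P.S. For reference, roughly the single-pack re-test I intend to run on my
return. This is only a sketch; the repository path and clone URL below are
placeholders:

	# collapse everything into one fully-packed pack, recomputing deltas
	cd /path/to/repo.git             # placeholder path
	git repack -a -d -f
	git count-objects -v             # should report "packs: 1" when fully packed

	# Nicolas' suggested delta-cache settings
	git config pack.deltaCacheSize 1
	git config pack.deltaCacheLimit 0

	# then time a fresh clone from another host
	time git clone git://server/repo.git   # placeholder URL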