Horrible re-packing?

Junio, Nico,
 I just tried doing a "git repack -a -d -f" to verify the pack generation, 
because I expected a full re-pack to do _better_ than doing occasional 
incrementals, but imagine my shock when IT SUCKS.

I didn't look at where the suckage started, but look at this:

	[torvalds@g5 git]$ git repack -a -d
	Generating pack...
	Done counting 21322 objects.
	Deltifying 21322 objects.
	 100% (21322/21322) done
	Writing 21322 objects.
	 100% (21322/21322) done
	Total 21322, written 21322 (delta 14489), reused 21319 (delta 14486)
	Pack pack-fe4ff117c9959ead3443b826a777423b3062b666 created.

	[torvalds@g5 git]$ ll .git/objects/pack/
	total 7008
	-rw-r--r-- 1 torvalds torvalds  512792 Jun  5 09:41 pack-fe4ff117c9959ead3443b826a777423b3062b666.idx
	-rw-r--r-- 1 torvalds torvalds 6643695 Jun  5 09:41 pack-fe4ff117c9959ead3443b826a777423b3062b666.pack

Ie, we have a nice 6.33MB pack-file.

Now:

	[torvalds@g5 git]$ git repack -a -d -f
	Generating pack...
	Done counting 21322 objects.
	Deltifying 21322 objects.
	 100% (21322/21322) done
	Writing 21322 objects.
	 100% (21322/21322) done
	Total 21322, written 21322 (delta 10187), reused 6777 (delta 0)
	Pack pack-fe4ff117c9959ead3443b826a777423b3062b666 created.

	[torvalds@g5 git]$ ll .git/objects/pack/
	total 15352
	-rw-r--r-- 1 torvalds torvalds   512792 Jun  5 09:41 pack-fe4ff117c9959ead3443b826a777423b3062b666.idx
	-rw-r--r-- 1 torvalds torvalds 15176139 Jun  5 09:41 pack-fe4ff117c9959ead3443b826a777423b3062b666.pack

Whaah! That nice 6.33MB pack-file exploded to 14.5MB!

I then did repeated "git repack -a -d" runs to try to do incrementals, and 
it stopped improving after the sixth one, at which point it was down to 
11.7MB, still almost twice as big as before.

Re-doing it with 

	git repack -a -d -f --depth=100 --window=100

got me back to 6.94MB, but that's still 10% larger than the pack-file I 
had before.
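
For reference, the sizes quoted above follow from the byte counts in the two 
"ll" listings, assuming "MB" here means 2^20 bytes (that is the reading that 
matches the numbers). A minimal sketch of the arithmetic; the helper and 
constant names are made up for illustration and are not part of git:

```python
# Byte counts copied from the two "ll .git/objects/pack/" listings above.
PACK_BEFORE_BYTES = 6643695    # after "git repack -a -d"
PACK_FORCED_BYTES = 15176139   # after "git repack -a -d -f"

MIB = 1024 * 1024  # assumption: "MB" in the text means 2**20 bytes


def to_mib(size_bytes):
    """Convert a byte count to MiB (2**20 bytes)."""
    return size_bytes / MIB


def growth_percent(old, new):
    """Percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    print("before -f : %.3f MiB" % to_mib(PACK_BEFORE_BYTES))
    print("after -f  : %.3f MiB" % to_mib(PACK_FORCED_BYTES))
    print("blow-up   : %.2fx" % (PACK_FORCED_BYTES / PACK_BEFORE_BYTES))
    # The 6.94MB and 6.33MB figures are taken from the text, not from a listing.
    print("window=100: %.1f%% larger" % growth_percent(6.33, 6.94))
```

So the forced re-pack is a bit more than twice the size of the original 
pack, and the --depth=100 --window=100 result is a little under 10% larger.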

Interestingly, it's the "window" that matters more. The depth part didn't 
make that huge of a difference, so it looks like it's the sorting 
heuristic that may be broken again.

And it's possibly broken by the fact that we've been renaming things 
lately (ie the "rev-list.c" -> "builtin-rev-list.c" thing ends up not 
finding things)

Nico? Any ideas?

			Linus