Re: 1.3.0 creating bigger packs than 1.2.3

I just spent some time bisecting this issue, and it looks like the
following change by Junio may be the culprit:

  commit 1d6b38cc76c348e2477506ca9759fc241e3d0d46
  Author: Junio C Hamano <junkio@xxxxxxx>
  Date:   Wed Feb 22 22:10:24 2006 -0800
  
      pack-objects: use full pathname to help hashing with "thin" pack.
      
      This uses the same hashing algorithm to the "preferred base
      tree" objects and the incoming pathnames, to group the same
      files from different revs together, while spreading files with
      the same basename in different directories.
      
      Signed-off-by: Junio C Hamano <junkio@xxxxxxx>
  
  :100644 100644 af3bdf5d358b8a47ed23bcb7e9721e956eb59d60 3a16b7e4ce25ec05c64817dfd92dd9d517ab9dd3 M      pack-objects.c
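
To illustrate what the commit message describes, here is a minimal sketch
in Python. It is not the code from pack-objects.c: the polynomial hash and
the names basename_hash, fullpath_hash and group_by_hash are assumptions
made up for this example. It only shows why hashing the full pathname keeps
revisions of one file together while separating files that merely share a
basename.

```python
# Hypothetical sketch: NOT the hashing code used in pack-objects.c.
MASK = 0xFFFFFFFF  # keep the hash in 32 bits, like a C unsigned int


def _poly_hash(text: str) -> int:
    """Simple polynomial hash over the bytes of text (illustrative only)."""
    h = 0
    for byte in text.encode("utf-8"):
        h = (h * 31 + byte) & MASK
    return h


def basename_hash(path: str) -> int:
    """Hash only the last path component."""
    return _poly_hash(path.rsplit("/", 1)[-1])


def fullpath_hash(path: str) -> int:
    """Hash the whole pathname, directories included."""
    return _poly_hash(path)


def group_by_hash(objects, hash_fn):
    """Bucket (path, rev) pairs by the hash of their path."""
    groups = {}
    for path, rev in objects:
        groups.setdefault(hash_fn(path), []).append((path, rev))
    return list(groups.values())


# Two files with the same basename, each present in two revisions.
objects = [
    ("Makefile", "v1.2.3"),
    ("Documentation/Makefile", "v1.2.3"),
    ("Makefile", "v1.3.0"),
    ("Documentation/Makefile", "v1.3.0"),
]

by_base = group_by_hash(objects, basename_hash)
by_full = group_by_hash(objects, fullpath_hash)

print(len(by_base), "bucket(s) when hashing the basename")
print(len(by_full), "bucket(s) when hashing the full pathname")
```

With these four entries the basename hash puts everything into one bucket,
while the full-pathname hash yields one bucket per file, each holding both
revisions of that file.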


Linus Torvalds <torvalds@xxxxxxxx> wrote:
> 
> 
> On Thu, 20 Apr 2006, Shawn Pearce wrote:
> > 
> > Oddly enough repacking the v1.2.3 pack using 1.3.0.g56c1 created an
> > even smaller pack ("git-repack -a -d"):
> 
> That's "normal". Repacking without -f will always pack _more_, never less. 
> So a different packing algorithm can only improve (of course, usually not 
> by a huge margin, and it quickly diminishes).
> 
> > but then adding -f definitely gives us the 2x explosion again:
> > 
> >   Total 46391, written 46391 (delta 6649), reused 37894 (delta 0)
> >   129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
> 
> Right. Doing the -f will discard any old packing info, so if the new 
> packing algorithm has problems (and it obviously does), then using -f will 
> show them.
> 
> > > You could try to revert that change:
> > > 
> > > 	git revert eeef7135fed9b8784627c4c96e125241c06c65e1
> > 
> > Whoa.  I did that revert and fixup on top of 'next'.  The pack
> > from "git-repack -a -d -f" is now even larger due to even less
> > delta reuse:
> 
> Ok, so that wasn't it, and the new sort order is superior.
> 
> That means that it probably _is_ the delta changes themselves (probably 
> commit c13c6bf7 "diff-delta: bound hash list length to avoid O(m*n) 
> behavior"). You can try
> 
> 	git revert c13c6bf7
> 
> to see if that's it. Although Nico already showed interest, and if you 
> make the archive available to him, he's sure to figure it out.
> 
> > With --window=50 on 'next' (without the revert):
> > 
> >   Total 46391, written 46391 (delta 6666), reused 39723 (delta 0)
> >   129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
> 
> Yeah, that didn't do much. Slightly more deltas than without, but not a 
> lot, and it didn't matter much size-wise.
> 
> You can try "--depth=50" (slogan: more "hot delta on delta action"), but 
> it's looking less and less like a delta selection issue, and more and more 
> like the deltas themselves are deproved.
> 
> 			Linus
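
For comparing the two summary lines quoted above, a small Python sketch
that parses them and computes the difference may help. The regular
expression and the field names (written_delta and so on) are assumptions
based on how the lines read, not taken from git's source.

```python
import re

# Assumed layout of the summary line, as quoted in this thread.
SUMMARY_RE = re.compile(
    r"Total (\d+), written (\d+) \(delta (\d+)\), reused (\d+) \(delta (\d+)\)"
)


def parse_summary(line: str) -> dict:
    """Turn one pack summary line into a dict of integers."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    keys = ("total", "written", "written_delta", "reused", "reused_delta")
    return dict(zip(keys, map(int, match.groups())))


# The two runs quoted in the thread: "-f" alone, and "--window=50".
forced_run = parse_summary(
    "Total 46391, written 46391 (delta 6649), reused 37894 (delta 0)"
)
window50_run = parse_summary(
    "Total 46391, written 46391 (delta 6666), reused 39723 (delta 0)"
)

# How many more objects were written as deltas with the wider window.
extra_deltas = window50_run["written_delta"] - forced_run["written_delta"]

# Share of written objects that were stored as deltas in the "-f" run.
delta_share = forced_run["written_delta"] / forced_run["written"]

print(f"extra deltas with --window=50: {extra_deltas}")
print(f"deltified share with -f: {delta_share:.1%}")
```

The wider window adds only 17 deltas out of 46391 objects, which matches
the observation that it made little difference to the pack size.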

-- 
Shawn.