On Sun, Jul 08, 2007 at 10:31:43PM -0700, Junio C Hamano wrote:
> Putting aside a potential argument that the way the file in
> question, version.lisp-expr, is kept track of might be insane,
> this is an interesting topic.

Yeah, that version numbering system worked quite well for CVS, given its
lack of any other kind of useful whole-tree versioning, and the fact
that there wasn't much branching and merging, due to it being a pain in
the ass.  If and when we move to something like Git, something else will
have to be done, as that file will /always/ be in conflict.

> In addition to the above stats, it may be interesting to know:
>
>  - pack generation time and memory footprint (/usr/bin/time);
>
>    I suspect you would have to try_delta more candidates, so
>    this may degrade a bit, but that is done for getting a better
>    deltification, so we would need to see if the extra cost is
>    within reason and worth spending.

It was already try_delta'ing everything in the window.  The only
difference now is that create_delta may generate one more byte of delta
before giving up.  That doesn't seem to have affected things at all
outside of sampling noise:

(These timings are for the Git pack on Linux/amd64, --window and --depth
both 100.  Since /usr/bin/time doesn't seem to report any useful memory
statistics on Linux, I also have a "ps aux" line from when the memory
size looked stable.  This was different from run to run but it shows the
two are in the same order of magnitude.)

Unpatched:

54.99user 0.18system 0:56.80elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (14major+32417minor)pagefaults 0swaps
bdowning  5290 98.7  4.5 106788 92900 pts/1    R+   01:26   0:49 git pack-obj

Patched:

55.37user 0.19system 0:56.35elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+32249minor)pagefaults 0swaps
bdowning  6086  100  4.5 106880 92996 pts/1    R+   01:29   0:49 git pack-obj

>  - resulting pack size (ls -l pack-*.pack)
>
>    I do not expect your change would degrade in this area, as
>    you are currently not trading size with shallower delta
>    depth.

The patched version is actually smaller in both SBCL's and Git's case
(again, --window 100 and --depth 100):

SBCL: 61696 bytes smaller (13294225-13232529)
Git:  16010 bytes smaller (12690424-12674414)

I believe the reason for this is that more deltas can get in under the
depth limit.  If I repack the Git pack with --depth=999999999, the
patched version generates a pack that is 1793 bytes smaller.
(12334183-12332390)  (Hmm, I was expecting that to be the same, I'm not
sure why it's not.  Padding?)

> Regarding your patch, I think it does not look too bad, as you
> never pick delta that is larger than the best-so-far in favor of
> shallower depth.
>
> It would become worrysome (*BUT* infinitely more interesting)
> once you start talking about a tradeoff between slightly larger
> delta and much shorter delta.  Such a tradeoff, if done right,
> would make a lot of sense, but I do not offhand think of a way
> to strike a proper balance between them efficiently.

Yeah, I was thinking about that too, and came to the same conclusion.
I suspect you'd have to save a /lot/ of delta depth to want to pay any
more I/O, though.

Another thing that might be iffy (and complicated) is that if you keep
making a good low-depth delta off of a particular object, it might be
good to promote it so it stays in the window for longer.
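
Going back to the selection rule discussed above (never take a larger
delta for the sake of depth), here is a rough sketch of how I read it.
This is illustrative Python, not the C in pack-objects: the names
Candidate, pick_base and max_depth are made up, and treating "equal size,
shallower base wins" as the tie-break is an assumption drawn from the
description in this thread rather than from the patch text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """A possible delta base in the window (hypothetical structure)."""
    name: str
    delta_size: int  # bytes needed to express the target against this base
    depth: int       # length of the delta chain already under this base


def pick_base(candidates: Iterable[Candidate],
              max_depth: int) -> Optional[Candidate]:
    """Pick the base giving the smallest delta; on equal size, the shallower one.

    Assumption: a base whose chain is already max_depth long cannot be used,
    since the new object would then exceed the depth limit.
    """
    best: Optional[Candidate] = None
    for cand in candidates:
        if cand.depth >= max_depth:
            continue  # using this base would exceed the depth limit
        if best is None:
            best = cand
        elif cand.delta_size < best.delta_size:
            best = cand  # strictly smaller delta always wins
        elif cand.delta_size == best.delta_size and cand.depth < best.depth:
            best = cand  # same size: prefer the shallower chain
        # A larger delta is never chosen, however shallow its base is.
    return best


window = [
    Candidate("a", delta_size=120, depth=7),
    Candidate("b", delta_size=120, depth=2),
    Candidate("c", delta_size=121, depth=0),
]
chosen = pick_base(window, max_depth=100)
```

With that window, "b" is chosen over "a" because the delta is the same
size and the chain is shorter, while "c" loses despite depth 0 because
its delta is one byte larger.  The tradeoff Junio describes would mean
relaxing that last rule, which this sketch deliberately does not do.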

-bcd