Jan-Benedict Glaw <jbglaw@xxxxxxxxxx> writes:

> On Sat, 2006-03-25 22:12:46 -0800, Junio C Hamano <junkio@xxxxxxx> wrote:
>> The script seems to do what it claims to, but now why would one
>> need to use this?  In other words, what's the situation one would
>> find this useful?
>
> It's possibly useful if you often access old objects with
> git-cat-file or git-ls-tree.

Benchmarks?

I created two cloned repositories from git.git.  The victim03
repository is fully packed with the default pack parameters, depth
and window both set to 10.  The victim04 repository has the same set
of objects and refs, but its pack is expanded into 16232 loose
objects (one way to set this up is sketched at the end of this
message).

Now, in the victim03 repository, 657 blobs have depth 10 (i.e. you
need to inflate and apply a delta 10 times to get to the object).
So I made a list of these "expensive to access" objects (also
sketched at the end of this message) and ran this:

    $ cd victim03
    $ /usr/bin/time sh -c '
        while read sha1; do git cat-file blob $sha1; done >/dev/null <list
      '
    3.43user 3.36system 0:07.17elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+364561minor)pagefaults 0swaps
    3.51user 3.33system 0:07.10elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+364499minor)pagefaults 0swaps
    3.76user 2.99system 0:07.28elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+365155minor)pagefaults 0swaps

With the same file list, in the victim04 repository with its 16232
loose objects:

    $ cd victim04
    $ /usr/bin/time sh -c '
        while read sha1; do git cat-file blob $sha1; done >/dev/null <../victim03/list
      '
    3.29user 2.98system 0:06.33elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+348786minor)pagefaults 0swaps
    3.26user 2.88system 0:06.63elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+347512minor)pagefaults 0swaps
    3.16user 2.98system 0:06.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+347489minor)pagefaults 0swaps

So you do get a slight performance gain by exploding the pack, but
on the other hand you are taxing the buffer cache quite heavily by
reading the loose objects (in both of the experiments above, I
discarded the numbers from the very first run).

The sizes of the object databases in the two cases are:

    $ du -sh victim0[34]/.git/objects
    6.2M    victim03/.git/objects
    84M     victim04/.git/objects

So I am still not convinced this would be useful in general.  It
used to be that exploding everything and repacking was the only way
to clean garbage out of packs, but after "repack -a -d" was invented
by Frank Sorenson, that became the more convenient way.  Especially
with the recent "delta reusing" pack-objects, doing "repack -a -d"
has become quite cheap, so...
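If you want to reproduce the victim04 setup, something along these
lines should work.  This is only a sketch: it assumes a fresh clone
with a single pack, and the scratch path is arbitrary.  The pack has
to be moved out of the object store first, because unpack-objects
skips objects the repository already has:

    $ cd victim04
    $ # move the pack away so its objects no longer "exist"
    $ mv .git/objects/pack/pack-*.pack /tmp/expand.pack
    $ rm .git/objects/pack/pack-*.idx
    $ # read the pack from stdin and write each object out loose
    $ git unpack-objects </tmp/expand.pack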
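And the list of deep blobs can be pulled out of the pack with
verify-pack.  Again just a sketch; it assumes the verbose output has
the object name, the type, and (for deltified objects) the delta
depth in the first, second and sixth columns -- double-check the
columns against the output of your git version:

    $ cd victim03
    $ # list every object in the pack, keep blobs whose delta depth is 10
    $ git verify-pack -v .git/objects/pack/pack-*.idx |
      awk '$2 == "blob" && $6 == 10 { print $1 }' >list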
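For comparison, the cleanup that made exploding unnecessary is just:

    $ git repack -a -d

which writes all reachable objects into a single new pack, removes
the now-redundant packs, and prunes loose objects that are contained
in the new pack.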