Re: [PATCH] Add git-explode-packs

Jan-Benedict Glaw <jbglaw@xxxxxxxxxx> writes:

> On Sat, 2006-03-25 22:12:46 -0800, Junio C Hamano <junkio@xxxxxxx> wrote:
>> The script seems to do what it claims to, but now why would one
>> need to use this?  In other words what's the situation one would
>> find this useful?
>
> It's possibly useful if you often access old objects with
> git-cat-file or git-ls-tree.

Benchmarks?

I created two repositories cloned from git.git.  The victim03
repository is fully packed with the default pack parameters,
depth and window both set to 10.  The victim04 repository has
the same set of objects and refs, but its pack has been exploded
into 16232 loose objects.

Now in the victim03 repository, 657 blobs have depth 10 (i.e. you
need to inflate and apply a delta 10 times to get to the object).
So I made a list of these "expensive to access" objects and
ran this:

	$ cd victim03
	$ /usr/bin/time sh -c '
            while read sha1; do git cat-file blob $sha1;
            done >/dev/null <list
	'

3.43user 3.36system 0:07.17elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+364561minor)pagefaults 0swaps
3.51user 3.33system 0:07.10elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+364499minor)pagefaults 0swaps
3.76user 2.99system 0:07.28elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+365155minor)pagefaults 0swaps
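
The message does not show how the list of depth-10 blobs was
built.  One possible way, sketched here as an assumption rather
than the method actually used, is to filter the output of
"git verify-pack -v".  The helper name deep_blobs is hypothetical,
and the column layout assumed is the one current git prints (name,
type, size, size in pack, offset, then depth and base name for
deltified objects only).

```shell
# Hypothetical helper: read `git verify-pack -v` output on stdin and
# print the names of blobs whose delta depth equals $1 (default 10).
# Assumed columns: name type size size-in-pack offset [depth base-name];
# the last two columns are present only for deltified objects, so
# undeltified entries and the trailing summary lines are skipped.
deep_blobs() {
    awk -v depth="${1:-10}" \
        '$2 == "blob" && NF == 7 && $6 == depth { print $1 }'
}

# Possible usage inside the packed repository:
#   git verify-pack -v .git/objects/pack/pack-*.idx | deep_blobs 10 >list
```

The resulting file would have one object name per line, which is
the shape the "while read sha1" loop above expects.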

With the same file list, in victim04 repository that has 16232
loose objects:

	$ cd victim04
	$ /usr/bin/time sh -c '
            while read sha1; do git cat-file blob $sha1;
            done >/dev/null <../victim03/list
	'

3.29user 2.98system 0:06.33elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+348786minor)pagefaults 0swaps
3.26user 2.88system 0:06.63elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+347512minor)pagefaults 0swaps
3.16user 2.98system 0:06.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+347489minor)pagefaults 0swaps
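
To compare the two sets of runs, the elapsed times can be
averaged.  The following is only a sketch: mean_elapsed is a
hypothetical helper, it assumes the GNU time output format shown
above, and it handles minutes:seconds but not hours.

```shell
# Hypothetical helper: average the "elapsed" field of GNU time output
# lines read on stdin and print the mean in seconds, two decimals.
mean_elapsed() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /elapsed$/) {
                t = $i
                sub(/elapsed$/, "", t)   # "0:07.17elapsed" -> "0:07.17"
                split(t, p, ":")         # p[1] = minutes, p[2] = seconds
                total += p[1] * 60 + p[2]
                runs++
            }
        }
    }
    END { if (runs) printf "%.2f\n", total / runs }'
}

# victim03 (packed) runs, copied from the output above
mean_elapsed <<'EOF'
3.43user 3.36system 0:07.17elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
3.51user 3.33system 0:07.10elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
3.76user 2.99system 0:07.28elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
EOF

# victim04 (loose objects) runs, copied from the output above
mean_elapsed <<'EOF'
3.29user 2.98system 0:06.33elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
3.26user 2.88system 0:06.63elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
3.16user 2.98system 0:06.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
EOF
```

This prints 7.18 for the packed repository and 6.39 for the
exploded one, i.e. the loose-object runs take roughly 11% less
wall-clock time.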

So you get a slight performance gain from exploding the pack
(about 6.4 seconds elapsed versus about 7.2), but on the other
hand you are taxing the buffer cache quite heavily by reading
the loose objects.  (In both of the experiments above, I
discarded the numbers from the very first run.)  The sizes of
the object databases in these cases are:

        $ du -sh victim0[34]/.git/objects
        6.2M    victim03/.git/objects
        84M     victim04/.git/objects

So I am still not convinced it would be useful in general.  It
used to be that exploding everything and repacking was the only
way to clean out garbage from packs, but after Frank Sorenson
invented "repack -a -d", that became the more convenient way.
Especially with the recent "delta reusing" pack-objects, doing
"repack -a -d" has become quite cheap, so...

