Junio C Hamano wrote:
> <trast@xxxxxxxxxxxxxxx> writes:
>
> > From: Thomas Rast <trast@xxxxxxxxxxxxxxx>
> >
> > While write-tree has code to write out the cache-tree information
> > (since we have to compute it anyway if the cache is stale), commit
> > lost this capability when it became a builtin and moved away from
> > using write-tree.
>
> Earlier the code read from the index, made sure that it is not unmerged by
> running cache_tree_update(), before running prepare-commit-msg hook.  The
> hook used to see the index that was read in this codepath which is the
> same as what pre-commit left us.
>
> Why run an extra I/O here?  The index file could be quite large, and I do
> not want people to be writing it out without good reason.

Ok, so let's run some numbers.  With the first test script below I'm
seeing:

before patch:

  $ time ./commit-in-large-tree.sh
  Initialized empty Git repository in /dev/shm/commit-in-large-tree.tmp/.git/
  6.9M    .git/index

  real    1m31.607s
  user    0m57.604s
  sys     0m29.976s

after patch: 14% speedup

  $ time ./commit-in-large-tree.sh
  Initialized empty Git repository in /dev/shm/commit-in-large-tree.tmp/.git/
  7.0M    .git/index

  real    1m18.521s
  user    0m53.430s
  sys     0m22.138s

On the other hand, if you touch every file as in the second script:

before patch:

  $ time ./commit-in-large-tree-2.sh
  Initialized empty Git repository in /dev/shm/commit-in-large-tree.tmp/.git/
  6.9M    .git/index

  real    1m40.910s
  user    0m58.731s
  sys     0m38.011s

after patch: 5% slowdown

  $ time ./commit-in-large-tree-2.sh
  Initialized empty Git repository in /dev/shm/commit-in-large-tree.tmp/.git/
  7.0M    .git/index

  real    1m45.465s
  user    1m2.329s
  sys     0m38.849s

I also ran a variant of the latter test that touches one file in only 100
(instead of all 1000) subdirs per commit, and there the patch is still a
speedup.

So I guess it depends on whether we expect users to mostly modify a small
part of the tree or the whole tree.
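As a sanity check on the percentages quoted above, here is a small sketch
that recomputes them from the "real" times.  The helper names to_seconds
and pct_change are made up for illustration; they are not part of git or
of the test scripts, and the sketch assumes times in the "XmY.YYYs" form
that the shell's time builtin prints.

```shell
#!/bin/sh

# Convert a `time`-style value such as "1m31.607s" to seconds.
# (Hypothetical helper, assumes the "XmY.YYYs" format.)
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Percent change in wall-clock time from "before" ($1) to "after" ($2).
# A negative result means the patched run was faster.
pct_change() {
    before=$(to_seconds "$1")
    after=$(to_seconds "$2")
    awk -v b="$before" -v a="$after" \
        'BEGIN { printf "%.1f\n", (a - b) / b * 100 }'
}

pct_change 1m31.607s 1m18.521s    # first script:  prints -14.3
pct_change 1m40.910s 1m45.465s    # second script: prints 4.5
```

This only looks at wall-clock time; the user and sys columns can be
compared the same way.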

Regarding your other email

> When we are running a partial commit, the index file you are writing back
> is a temporary index only to build a tree object to record in the commit,
> which we already have done, and the temporary will be discarded.

that's a valid point that I need to address.


-- 8< -- commit-in-large-tree.sh
#!/bin/sh

set -e

git init /dev/shm/commit-in-large-tree.tmp
cd /dev/shm/commit-in-large-tree.tmp

# Build a tree of 1000 directories with 100 small files each.
for i in $(seq 1 1000); do
    mkdir "$i"
    (
        cd "$i"
        for j in $(seq 1 100); do
            echo "$j" > "$j"
        done
    )
    git add "$i"
done

git commit -q -m initial

du -h .git/index

# 100 commits, each changing a single file in a different directory.
for i in $(seq 1 100); do
    echo "$i changed" > "$i/$i"
    git add "$i/$i"
    git commit -q -m "$i"
done

rm -rf /dev/shm/commit-in-large-tree.tmp
-- >8 --

-- 8< -- commit-in-large-tree-2.sh
#!/bin/sh

set -e

git init /dev/shm/commit-in-large-tree.tmp
cd /dev/shm/commit-in-large-tree.tmp

# Build a tree of 1000 directories with 100 small files each.
for i in $(seq 1 1000); do
    mkdir "$i"
    (
        cd "$i"
        for j in $(seq 1 100); do
            echo "$j" > "$j"
        done
    )
    git add "$i"
done

git commit -q -m initial

du -h .git/index

# 100 commits, each changing one file in every one of the 1000 directories.
for i in $(seq 1 100); do
    for j in $(seq 1 1000); do
        echo "$i changed" > "$j/$i"
    done
    git add -u
    git commit -q -m "$i"
done

rm -rf /dev/shm/commit-in-large-tree.tmp
-- >8 --

-- 
Thomas Rast
trast@{inf,student}.ethz.ch