Benchmarking git-add vs git-ls-files+update-index (was: way to automatically add untracked files?)

David Kastrup <dak@xxxxxxx> writes:

> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
>> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>>
>>> ... (except, as we found out last week, we've had a huge 
>>> performance regression, so that's actually a really slow way to do it, and 
>>> so it's actually faster to do
>>>
>>> 	git ls-files -o | git update-index --add --stdin
>>> 	git commit -a
>>>
>>> instead...
>>
>> Is it still the case after the fix in rc4?  Other than the
>> theoretical "on multi-core, ls-files and update-index can run in
>> parallel" performance boost potential,

dak@lola:/home/tmp/texlive$ git-init
Initialized empty Git repository in .git/
dak@lola:/home/tmp/texlive$ time git --work-tree=/usr/local/texlive/2007/texmf-dist add .

real    9m36.256s
user    2m2.408s
sys     0m25.874s
dak@lola:/home/tmp/texlive$ time git --work-tree=/usr/local/texlive/2007/texmf-dist add .

real    0m34.161s
user    0m0.448s
sys     0m2.212s

[So the rc4 fix seems to have made it.]

dak@lola:/home/tmp/texlive$ rm -rf .git;git-init
Initialized empty Git repository in .git/

dak@lola:/home/tmp/texlive$ time git --work-tree=/usr/local/texlive/2007/texmf-dist ls-files -z -m -o .|(cd /usr/local/texlive/2007/texmf-dist;git --git-dir=/home/tmp/texlive/.git update-index --add -z --stdin)

real    8m9.370s
user    2m1.172s
sys     0m25.138s
dak@lola:/home/tmp/texlive$ time git --work-tree=/usr/local/texlive/2007/texmf-dist ls-files -z -m -o .|(cd /usr/local/texlive/2007/texmf-dist;git --git-dir=/home/tmp/texlive/.git update-index --add -z --stdin)

real    6m4.447s
user    0m16.801s
sys     0m12.333s
dak@lola:/home/tmp/texlive$ 

[Hm.  Maybe "modified" files are not what I think they are?]

dak@lola:/home/tmp/texlive$ time git --work-tree=/usr/local/texlive/2007/texmf-dist ls-files -z -o .|(cd /usr/local/texlive/2007/texmf-dist;git --git-dir=/home/tmp/texlive/.git update-index --add -z --stdin)

real    6m0.120s
user    0m16.977s
sys     0m12.653s

[No, doesn't help.]

[Just for kicks, let's try getting the Linux scheduler out of our hair
in the initial case.]

dak@lola:/home/tmp/texlive$ time git --work-tree=/usr/local/texlive/2007/texmf-dist ls-files -z -m -o .|dd bs=8k|(cd /usr/local/texlive/2007/texmf-dist;git --git-dir=/home/tmp/texlive/.git update-index --add -z --stdin)
201+1 records in
201+1 records out
1650230 bytes (1.7 MB) copied, 513.125 seconds, 3.2 kB/s

real    8m45.088s
user    2m2.052s
sys     0m25.870s

[Hm, does more harm than good.]

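So in summary: on a repeat run the pipeline still takes about 6 minutes
where git-add takes 34 seconds, and dropping -m does not change that.
So git-ls-files is apparently lacking the optimization of git-add for
unchanged inodes.  Bad.  Using it together with git-update-index in the
initial case saves some time over git-add (8m9s against 9m36s), but not
breathtakingly so.  This is on a single core.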
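
For anyone who wants to try the two approaches on something smaller
than a TeX Live tree, here is a minimal sketch.  It assumes a git new
enough to have "git -C", and the scratch directories and file names
are made up for illustration.  It checks only that both approaches
stage the same entries; it measures nothing.

```shell
#!/bin/sh
# Build a scratch repository containing three small untracked files.
# Directory layout and file names are hypothetical.
make_tree() {
    dir=$(mktemp -d)
    mkdir "$dir/sub"
    printf 'one\n'   > "$dir/a.txt"
    printf 'two\n'   > "$dir/b.txt"
    printf 'three\n' > "$dir/sub/c.txt"
    git -C "$dir" init -q
    echo "$dir"
}

repo_a=$(make_tree)
repo_b=$(make_tree)

# Approach 1: plain git add.
(cd "$repo_a" && git add .)

# Approach 2: list untracked files NUL-separated and feed them to update-index.
(cd "$repo_b" && git ls-files -o -z | git update-index --add -z --stdin)

# Compare the resulting index entries (mode, blob id, stage, path).
index_a=$(git -C "$repo_a" ls-files -s)
index_b=$(git -C "$repo_b" ls-files -s)
staged_count=$(git -C "$repo_b" ls-files | wc -l | tr -d ' ')

if [ "$index_a" = "$index_b" ]; then
    echo "same index, $staged_count files staged"
else
    echo "index contents differ"
fi

rm -rf "$repo_a" "$repo_b"
```

Note that "ls-files -o" without --exclude-standard also lists ignored
files, so on a tree with ignore rules the two approaches need not
stage the same set.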

Most of the time is spent waiting for I/O.  Running ls-files and
update-index in parallel should in principle reduce the waiting time,
but at least in this combination, the payoff does not seem
overwhelming.
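
The I/O claim can be checked roughly from the timings above by
computing what share of the wall-clock time was CPU time, that is
(user + sys) / real.  The sketch below does that arithmetic; the helper
names to_seconds and cpu_share are made up for illustration, and a low
share is only consistent with waiting on I/O, not proof of it.

```shell
#!/bin/sh
# Convert a `time`-style value such as "9m36.256s" into seconds.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print (user + sys) / real as a percentage with one decimal.
# Arguments: real, user, sys, each in the format printed by `time`.
cpu_share() {
    real=$(to_seconds "$1")
    user=$(to_seconds "$2")
    sys=$(to_seconds "$3")
    LC_ALL=C awk -v r="$real" -v u="$user" -v s="$sys" \
        'BEGIN { printf "%.1f\n", 100 * (u + s) / r }'
}

# Timings copied from the runs above.
cpu_share 9m36.256s 2m2.408s  0m25.874s   # git add, initial run:  25.7
cpu_share 8m9.370s  2m1.172s  0m25.138s   # pipeline, initial run: 29.9
cpu_share 0m34.161s 0m0.448s  0m2.212s    # git add, repeat run:   7.8
cpu_share 6m4.447s  0m16.801s 0m12.333s   # pipeline, repeat run:  8.0
```

So in every run the processes were on the CPU for less than a third of
the elapsed time.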

One should mention that the stuff I tested it on is actually sitting
on a reiserfs file system (though the repository is on ext3).

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum