Performance issue with excludes (was: Re: git-svn and submodules)

On Oct 15, 2007, at 5:53 PM, Linus Torvalds wrote:

On Mon, 15 Oct 2007, Benoit SIGOURE wrote:

- git svn create-ignore (to create one .gitignore per directory from the
svn:ignore properties). This has the disadvantage of committing the
.gitignore files during the next dcommit, but when you import a repo
with tons of ignores (>1000), using git svn show-ignore to build
.git/info/exclude is *not* a good idea, because things like git-status
will end up doing >1000 fnmatch calls *per file* in the repo, which
leads to git-status taking more than 4s on my Core2Duo 2GHz 2G RAM.

Ouch.

That sounds largely unavoidable.. *But*.

Maybe we have a bug here. In particular, we generally shouldn't care
about the exclude/.gitignore file for any paths that we know about,
which means that during an import, we really shouldn't ever even care
about .gitignore, since all the files are files we are expected to know
about.

So yes, in general, "git status" is going to be slow in a tree that has
been built (since things like object files etc will have to be checked
against the exclude list! (*)), but if it's a clean import with no
generated files and only files we already know about, that should not be
the case.
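
To make the idea above concrete, here is a minimal shell sketch of the suggested shortcut: paths that are already tracked are never compared against the exclude patterns. It is an illustration under stated assumptions, not git's actual code; the file list, the patterns, and the function names (is_tracked, is_excluded, classify) are all hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration only; this is not how git is implemented.
tracked="src/main.c
src/util.c
README"                 # assumed list of paths the index already knows
patterns="*.o *.a *~"   # assumed exclude patterns
match_calls=0           # counts pattern comparisons (stand-in for fnmatch calls)

# Succeeds if the path is an exact line of the tracked list.
is_tracked() {
    printf '%s\n' "$tracked" | grep -Fqx -- "$1"
}

# Succeeds if the file name matches any exclude pattern.
is_excluded() {
    name=${1##*/}
    set -f                      # do not expand the patterns against the cwd
    for pat in $patterns; do
        match_calls=$((match_calls + 1))
        case $name in
            $pat) set +f; return 0 ;;
        esac
    done
    set +f
    return 1
}

# Sets $result to tracked, ignored, or untracked.
classify() {
    if is_tracked "$1"; then
        result=tracked          # known path: no pattern matching at all
    elif is_excluded "$1"; then
        result=ignored
    else
        result=untracked
    fi
}

for p in src/main.c src/util.c README src/main.o notes.txt; do
    classify "$p"
    printf '%s\t%s\n' "$result" "$p"
done
echo "pattern comparisons: $match_calls"
```

In this toy run only the two paths that are not tracked reach the pattern loop, so the counter ends at 4; checking every path against every pattern would take 13 comparisons for the same five paths.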

I re-used the test that was posted some time ago:

---------------------------------------------------------------------------
#
# first create a tree of roughly 100k files
#
mkdir bummer
cd bummer
for ((i=0;i<100;i++)); do
  mkdir $i && pushd $i;
  for ((j=0;j<1000;j++)); do
    echo "$j" >$j;
  done;
  popd;
done

#
# init and add this to git
#
time git init
git config user.email "no@thx"
git config user.name "nothx"
time git add .
time git commit -m 'buurrrrn' -a

# add 1000 exclude patterns (append with >> so earlier ones are kept)
for ((j=0;j<1000;j++)); do
  echo "/pattern$j" >>.git/info/exclude
done

#
# git-status, which takes about 8s for me
#
time git-status
time git-status
time git-status
---------------------------------------------------------------------------

[...]
git commit -m 'buurrrrn' -a 5.62s user 16.84s system 87% cpu 25.634 total
# On branch master
nothing to commit (working directory clean)
git-status  2.48s user 5.97s system 96% cpu 8.718 total
# On branch master
nothing to commit (working directory clean)
git-status  2.48s user 5.94s system 97% cpu 8.646 total
# On branch master
nothing to commit (working directory clean)
git-status  2.48s user 5.95s system 96% cpu 8.720 total

My machine is a Core2Duo at 2GHz with 2G of RAM.
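
For a sense of scale, the following back-of-the-envelope sketch computes the worst case for the tree built by the test script, assuming all 1000 patterns end up in .git/info/exclude and every file is compared against every pattern. The variable names are only illustrative, and the result is an upper bound on pattern comparisons, not a measurement.

```shell
#!/bin/sh
# Upper bound on pattern comparisons for the test tree above.
dirs=100             # directories created by the outer loop
files_per_dir=1000   # files created in each directory
patterns=1000        # exclude patterns the test intends to install

files=$((dirs * files_per_dir))
worst_case=$((files * patterns))

echo "files: $files"
echo "worst-case pattern comparisons per git-status run: $worst_case"
```

That is 100,000 files and up to 100,000,000 comparisons per run, which is why skipping the check for known paths would matter so much here.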


So maybe we have a totally unnecessary performance issue, and do all the
fnmatch() on every path, whether we know about it or not?

		Linus

(*) It might be that we could also re-order the exclude list so that
entries that trigger are moved to the head of the list, because it's
likely that if you have tons of exclude entries, some of them trigger a
lot more than others (i.e. "*.o"), and trying those first is likely a
good idea.
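
The re-ordering described in this footnote is essentially a move-to-front heuristic. A minimal shell sketch follows, assuming a plain space-separated pattern list; the list contents and the function name is_excluded are hypothetical, and nothing here reflects how git stores its exclude entries.

```shell
#!/bin/sh
# Hypothetical move-to-front sketch; not git's actual data structure.
patterns="*.a *~ *.log *.o"   # assumed exclude list, hot pattern last
match_calls=0                 # counts pattern comparisons

# Succeeds if the file name matches a pattern. On a hit, the matching
# pattern is moved to the head of the list so it is tried first next time.
is_excluded() {
    name=${1##*/}
    set -f                    # do not expand the patterns against the cwd
    for pat in $patterns; do
        match_calls=$((match_calls + 1))
        case $name in
            $pat)
                # rebuild the list: the hit first, the rest in their old order
                rest=""
                for p in $patterns; do
                    [ "$p" = "$pat" ] || rest="$rest $p"
                done
                patterns="$pat$rest"
                set +f
                return 0 ;;
        esac
    done
    set +f
    return 1
}

for f in a.o b.o c.o d.o; do
    is_excluded "$f"
done
echo "pattern comparisons: $match_calls"
echo "pattern order: $patterns"
```

With four object files, the first lookup walks the whole list (4 comparisons) and each later one hits immediately, for 7 comparisons in total instead of 16 with a fixed order. The trade-off is that this only changes the cost of paths that match something; a path that matches nothing still pays for the full list.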

--
Benoit Sigoure aka Tsuna
EPITA Research and Development Laboratory

