performance problem: "git commit filename"

I thought we had fixed this long long ago, but if we did, it has 
re-surfaced.

Using an explicit filename with "git commit" is _extremely_ slow. Lookie 
here:

	[torvalds@woody linux]$ time git commit fs/exec.c
	no changes added to commit (use "git add" and/or "git commit -a")

	real    0m1.671s
	user    0m1.200s
	sys     0m0.328s

that's closer to two seconds on a fast machine, with the whole tree 
cached!

And for the uncached case, it's just unbearably slow: two and a half 
*minutes*.

In contrast, without the filename, it's much faster:

	[torvalds@woody linux]$ time git commit
	no changes added to commit (use "git add" and/or "git commit -a")
	
	real    0m0.387s
	user    0m0.220s
	sys     0m0.168s

with the cold-cache case now being "just" 18s (which is still long, but 
we're talking eight times faster, and certainly not unbearable!)

Doing an "strace -c" on the thing shows why. In the filename case, we 
have:

	% time     seconds  usecs/call     calls    errors syscall
	------ ----------- ----------- --------- --------- ----------------
	 32.69    0.000868           0     92299        37 lstat
	 17.40    0.000462           0     29958      3993 open
	 15.78    0.000419           0      5522           getdents
	 15.56    0.000413           0     23165           mmap
	 11.37    0.000302           0     23118           munmap
	  5.76    0.000153           0     25966         2 close
	  1.43    0.000038           0      2845           fstat
	...

and in the non-filename case we have

	% time     seconds  usecs/call     calls    errors syscall
	------ ----------- ----------- --------- --------- ----------------
	 53.67    0.000600           0     69227        31 lstat
	 23.35    0.000261           0      5522           getdents
	 11.09    0.000124           2        55           munmap
	  4.20    0.000047           0       285           write
	  3.31    0.000037           0      5537      2638 open
	  2.33    0.000026           0      2899         1 close
	  2.06    0.000023           0      2844           fstat
	...

notice how the expensive case has a lot of successful open/mmap/munmap 
calls: it is *literally* ignoring the valid entries in the old index 
entirely, and re-hashing every single file in the tree! No wonder it is 
slow!
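
To make the point concrete, here is a minimal Python sketch of what honoring "valid entries in the old index" means. It is not git's actual code: `CachedEntry`, `current_sha1` and the `stats` counter are hypothetical names, and the validity check is cut down to size and mtime only. The one part taken from git itself is the blob hash, which is SHA-1 over a "blob <size>\0" header followed by the file contents.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class CachedEntry:
    # Hypothetical, heavily simplified stand-in for one index entry.
    size: int
    mtime_ns: int
    sha1: str


def blob_sha1(path):
    # The expensive path: open the file, read all of it, and hash it.
    # Git hashes a blob as SHA-1 over "blob <size>\0" plus the content.
    with open(path, "rb") as f:
        data = f.read()
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


def current_sha1(path, entry, stats):
    # The cheap path: a single lstat() compared against the cached entry.
    st = os.lstat(path)
    stats["lstat"] += 1
    if st.st_size == entry.size and st.st_mtime_ns == entry.mtime_ns:
        return entry.sha1  # entry still valid: no open, no read, no hash
    # Only a stat mismatch should force the file to be re-read.
    stats["rehash"] += 1
    return blob_sha1(path)
```

With a check like this, an unchanged tree should cost roughly one lstat per entry and zero re-hashes. The strace numbers for the filename case look like the `blob_sha1` path is being taken for every file regardless.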

Just counting "lstat()" calls, it's worth noticing that the non-filename 
case seems to do three lstats for each index entry (and yes, that's two
too many), but the named file case has upped that to *four* lstats per 
entry, and then added one open/mmap/munmap/close per entry on top of that!
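
The per-entry figures can be checked with a few lines of Python against the strace counts above. One assumption is needed: the post does not give the number of index entries, so this sketch uses the munmap count from the filename case (23118, about one per re-hashed file) as an estimate.

```python
# Counts copied from the two "strace -c" tables.
lstat_named, lstat_plain = 92299, 69227
open_named, open_failed_named = 29958, 3993
open_plain, open_failed_plain = 5537, 2638

# Assumption: one munmap per re-hashed file, so the munmap count in
# the filename case approximates the number of index entries.
entries = 23118

lstat_per_entry_plain = lstat_plain / entries
lstat_per_entry_named = lstat_named / entries

# Successful opens that only happen in the filename case.
extra_successful_opens = (open_named - open_failed_named) - (
    open_plain - open_failed_plain
)

print(f"lstat per entry, no filename: {lstat_per_entry_plain:.2f}")
print(f"lstat per entry, filename:    {lstat_per_entry_named:.2f}")
print(f"extra successful opens:       {extra_successful_opens}")
```

Under that assumption the ratios come out at about 2.99 and 3.99 lstats per entry, and the number of extra successful opens (23066) is close to the estimated entry count, which fits one extra open per file.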

I'm pretty sure we didn't use to do things this badly. And if this is a 
regression like I think it is, it should be fixed before a real 1.5.4 
release.

I'll try to see if I can see what's up, but I thought I'd better let 
others know too, in case I don't have time. I *suspect* (but have nothing 
whatsoever to back that up) that this happened as part of making commit 
a builtin.

			Linus