Re: Use a *real* built-in diff generator

On Sat, 25 Mar 2006, Alex Riesen wrote:
> 
> Even more impressive on Cygwin (>50x!):
> 
> .../git-win$ time git --exec-path=$(pwd) diff initial.. > /dev/null
> real    0m1.485s
> user    0m0.567s
> sys     0m0.840s
> 
> ../git-win$ time git diff initial.. >/dev/null
> real    1m20.781s
> user    0m31.806s
> sys     0m20.717s

Yeah. That's the difference between "unusable" and "pretty damn good".

Now, if we didn't even bother to write temporary files (and just handled 
the objects entirely in memory) I'd be even happier. I suspect it would 
help cygwin even more.

I've done a "strace" on "git-diff-tree" doing the 5-second diff of the 
kernel tree, and almost all of it looks like this:

	..
	open("/tmp/.diff_WgWi1X", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
	write(3, "/*\n * Driver for Digigram pcxhr "..., 6121) = 6121
	close(3)                                = 0
	open("/tmp/.diff_hCzrFe", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
	write(3, "/*\n * Driver for Digigram pcxhr "..., 6138) = 6138
	close(3)                                = 0
	rt_sigaction(SIGINT, {0x1000f650, [INT], SA_RESTART}, {0x1000f650, [INT], SA_RESTART}, 8) = 0
	open("/tmp/.diff_WgWi1X", O_RDONLY)     = 3
	fstat64(3, {st_mode=S_IFREG|0600, st_size=6121, ...}) = 0
	read(3, "/*\n * Driver for Digigram pcxhr "..., 6121) = 6121
	close(3)                                = 0
	open("/tmp/.diff_hCzrFe", O_RDONLY)     = 3
	fstat64(3, {st_mode=S_IFREG|0600, st_size=6138, ...}) = 0
	read(3, "/*\n * Driver for Digigram pcxhr "..., 6138) = 6138
	close(3)                                = 0
	unlink("/tmp/.diff_WgWi1X")             = 0
	unlink("/tmp/.diff_hCzrFe")             = 0
	..

which is just ridiculous. Those are _literally_ the only system calls we 
do any more after the conversion, if you ignore a few "brk()" calls here 
and there to allocate/free memory and obviously a number of "write(1,..." 
calls to actually write out the result!
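
For readers who want to see that pattern outside of strace, here is a rough
sketch of the per-filepair round trip in Python. It is an illustration only,
not git's code: the helper names (write_temp, read_back, roundtrip_pair) and
the tmpdir parameter are made up for this example, and tempfile.mkstemp is
used because it creates the file exclusively with mode 0600, much like the
open() calls in the trace.

```python
import os
import tempfile


def write_temp(data: bytes, tmpdir=None) -> str:
    """Write one blob to a fresh temp file (hypothetical helper)."""
    # mkstemp creates the file exclusively with mode 0600.
    fd, path = tempfile.mkstemp(prefix=".diff_", dir=tmpdir)
    try:
        view = memoryview(data)
        while view:                      # os.write may write partially
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)
    return path


def read_back(path: str) -> bytes:
    """Re-open the temp file read-only, fstat it, and read it all back."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        buf = bytearray()
        while len(buf) < size:
            chunk = os.read(fd, size - len(buf))
            if not chunk:
                break
            buf += chunk
        return bytes(buf)
    finally:
        os.close(fd)


def roundtrip_pair(old: bytes, new: bytes, tmpdir=None):
    """Write both sides out, read both back, then unlink both files."""
    path_old = write_temp(old, tmpdir)
    path_new = write_temp(new, tmpdir)
    try:
        return read_back(path_old), read_back(path_new)
    finally:
        os.unlink(path_old)
        os.unlink(path_new)
```

The caller gets back exactly the bytes it already had in memory, which is
the point: every open, write, read and unlink in between is pure overhead.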

(This is with a fully packed tree, so we just set up the object store with 
a single mmap at the beginning, which is why there are no reads to read 
the actual source contents).

Now, Linux is good at temp-files, but still: it adds nothing but overhead 
to first write out and then read back in over three _thousand_ filepairs 
(only to delete them immediately after reading), when the new code 
actually just wants to do the diff in memory anyway.
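
As a minimal sketch of what "do the diff in memory" means, assuming both
versions of a file are already available as byte buffers: the function below
goes from buffers straight to a unified diff and never touches the
filesystem. Python's difflib stands in here for the built-in diff generator;
it is not what git uses, and the function name and the a/ and b/ labels are
chosen for this example.

```python
import difflib


def diff_in_memory(old: bytes, new: bytes, name: str = "file") -> str:
    """Return a unified diff of two in-memory buffers (illustrative only)."""
    # Split into lines but keep the line endings so the output is exact.
    old_lines = old.decode("utf-8", errors="replace").splitlines(keepends=True)
    new_lines = new.decode("utf-8", errors="replace").splitlines(keepends=True)
    # No temp files: the diff generator works directly on the line lists.
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="a/" + name, tofile="b/" + name
        )
    )
```

With something like this, the only system calls left per filepair would be
the writes that print the result.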

		Linus