Re: [PATCH] Implement limited context matching in git-apply.

Linus Torvalds <torvalds@xxxxxxxx> · Tue, 11 Apr 2006 11:23:21 -0700 (PDT)

On Mon, 10 Apr 2006, Eric W. Biederman wrote:
> 
> So at a quick inspection it looks to me like:
> About .059s to perform the check for missing files.
> About .019s to write the new tree.
> About .155s in start up overhead, read_cache, and sanity checks.
> 
> So at first glance it looks like librification, to
> allow the redundant work to be skipped, is where
> the big speed win on my machine would be.

That sounded wrong to me, so I did a stupid patch to datestamp the 
different phases of git-write-tree, and here's what it says for me:

     0.000479 setup_git_directory
     0.008333 read_cache
     0.000813 ce_stage check
     0.001838 tree validity check
     0.037233 write_tree itself

	real    0m0.051s
	user    0m0.044s
	sys     0m0.008s

All times are in seconds.
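
For illustration, datestamping the phases of a run amounts to something
like the sketch below. It is Python rather than the C that git-write-tree
is written in, and it is not the patch mentioned above: `PhaseTimer`,
`run_phases` and the step callables are hypothetical names made up for
this example.

```python
import time


class PhaseTimer:
    """Record the wall-clock time spent between successive phase marks."""

    def __init__(self):
        self.last = time.perf_counter()
        self.phases = []  # list of (phase name, elapsed seconds)

    def mark(self, name):
        """Close the current phase and attribute the elapsed time to `name`."""
        now = time.perf_counter()
        self.phases.append((name, now - self.last))
        self.last = now

    def report(self):
        """Format one line per phase: seconds first, then the phase name."""
        return "\n".join(f"{secs:12.6f} {name}" for name, secs in self.phases)


def run_phases(steps):
    """Run each (name, callable) step in order, timing every one of them."""
    timer = PhaseTimer()
    for name, func in steps:
        func()
        timer.mark(name)
    return timer
```

Calling `run_phases([("read_cache", f), ("write_tree itself", g)])` with
your own callables `f` and `g`, then printing `timer.report()`, gives a
listing in the same shape as the one above.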

There is some overhead from the actual process startup (the timestamp 
numbers add up to 0.048696 seconds, which is less than the 0.051 reported 
by "time" - since I didn't datestamp everything), but the biggest chunk by 
far (about three quarters of the total time, including _all_ the setup 
like executing the process) is the actual call to write_tree() itself.
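
The arithmetic behind those figures can be checked with a few lines of
Python. The numbers are copied from the listing above; the variable names
are only for this example.

```python
# Phase timings in seconds, copied from the listing above.
phases = {
    "setup_git_directory": 0.000479,
    "read_cache": 0.008333,
    "ce_stage check": 0.000813,
    "tree validity check": 0.001838,
    "write_tree itself": 0.037233,
}
real = 0.051  # wall-clock time reported by "time"

total = sum(phases.values())                         # sum of the datestamped phases
unaccounted = real - total                           # startup etc. that was not datestamped
write_tree_share = phases["write_tree itself"] / real

print(f"datestamped total: {total:.6f}s")
print(f"not datestamped:   {unaccounted:.6f}s")
print(f"write_tree share:  {write_tree_share:.0%} of real time")
```

The share comes out at roughly 73%, which is the "about three quarters"
above.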

So it probably wouldn't actually be that big a win performance-wise to 
make write_tree() a library and call it directly from git-apply with some 
flag.

To really speed up write-tree, you'd have to know which trees need to be 
written, and just skip the rest. And skipping alone is not enough: you 
also need to know the SHA1's of the trees you skipped, since they go into 
the parent tree, and you _will_ change at least the top parent tree if 
you had a valid patch.

Which would imply pretty major surgery - you'd have to add the tree entry 
information to the index file, and make sure it got invalidated properly 
(all the way to the root) whenever adding/deleting/updating a path in the 
index file.
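
To make the invalidation rule concrete, here is a minimal Python sketch of
the idea. It is not git code and not the index file format: `TreeCache`
and its methods are hypothetical, the "tree" that gets hashed is a
simplified text listing rather than a real git tree object, and for
simplicity a rebuilt directory rescans the whole path list. The only point
is that changing one path drops the cached SHA1 of every directory from
that path up to the root, while untouched sibling trees keep theirs.

```python
import hashlib


class TreeCache:
    """Toy index: file paths plus a per-directory cache of tree ids."""

    def __init__(self):
        self.files = {}    # path -> blob id (any string)
        self.trees = {}    # directory ("" is the root) -> cached tree id
        self.computed = 0  # number of trees actually (re)hashed

    def update(self, path, blob_id):
        """Add or change a path, invalidating its parent trees."""
        self.files[path] = blob_id
        self._invalidate(path)

    def remove(self, path):
        """Delete a path, invalidating its parent trees."""
        self.files.pop(path, None)
        self._invalidate(path)

    def _invalidate(self, path):
        # Drop the cached id of every directory from the file's parent
        # all the way up to the root.
        parts = path.split("/")[:-1]
        while True:
            self.trees.pop("/".join(parts), None)
            if not parts:
                break
            parts.pop()

    def write_tree(self, directory=""):
        """Return the tree id for `directory`, reusing still-valid ids."""
        cached = self.trees.get(directory)
        if cached is not None:
            return cached  # unchanged subtree: skip it, but keep its id
        prefix = directory + "/" if directory else ""
        entries = {}
        for path, blob_id in self.files.items():
            if not path.startswith(prefix):
                continue
            name, _, rest = path[len(prefix):].partition("/")
            entries[name] = None if rest else blob_id  # None marks a subdirectory
        lines = []
        for name in sorted(entries):
            if entries[name] is None:
                lines.append(f"tree {self.write_tree(prefix + name)} {name}")
            else:
                lines.append(f"blob {entries[name]} {name}")
        tree_id = hashlib.sha1("\n".join(lines).encode()).hexdigest()
        self.trees[directory] = tree_id
        self.computed += 1
        return tree_id
```

With paths `a/x.c`, `a/b/y.c` and `d/z.c`, the first `write_tree()` hashes
four trees (root, `a`, `a/b`, `d`). After `update("a/b/y.c", ...)` the
next `write_tree()` rehashes only `a/b`, `a` and the root, and reuses the
cached id for `d`.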

Quite frankly, I don't think it's really worth it.

Yes, it would speed up applying huge patch-sets, but it's not like 
we're really slow at that even now, and I suspect you'd be better off 
trying to either live with it, or trying to see if you could change your 
workflow. There clearly _are_ tools that are better at handling pure 
patches, with quilt being the obvious example.

I routinely apply 100-200 patches in one go, and that's fast enough to not 
even be an issue. Yes, I have reasonably fast hardware, but we're likely 
talking thousands of patches in a series for it to be _really_ painful 
even on pretty basic developer hardware. Even a slow machine should do a 
few hundred patches in a couple of minutes.

Maybe enough time to get a cup of coffee, but no more than it would take 
to compile the project.

			Linus