Re: [PATCH] Implement limited context matching in git-apply.

Linus Torvalds <torvalds@xxxxxxxx> writes:

> On Mon, 10 Apr 2006, Eric W. Biederman wrote:
>> 
>> So at a quick inspection it looks to me like:
>> About .059s to check for missing files.
>> About .019s to write the new tree.
>> About .155s in start up overhead, read_cache, and sanity checks.
>> 
>> So at first glance it looks like librification, to allow
>> the redundant work to be skipped, is where the big speed
>> win on my machine would be.
>
> That sounded wrong to me, so I did a stupid patch to datestamp the 
> different phases of git-write-tree, and here's what it says for me:
>
>      0.000479 setup_git_directory
>      0.008333 read_cache
>      0.000813 ce_stage check
>      0.001838 tree validity check
>      0.037233 write_tree itself
>
> 	real    0m0.051s
> 	user    0m0.044s
> 	sys     0m0.008s
>
> all times are in seconds. 

Ok.  This is interesting and probably reveals what is different
about my setup.  For you user+sys = real.  For me there was
a significant gap.  So it looks like for some reason I was not
succeeding in keeping .git/index hot in the page cache.

When you are I/O bound it does make sense for read_cache
to be the dominant cost.  I just need to track down what is
making my machine I/O bound.  Having too little RAM is the
obvious candidate, but it seems too simple.  Currently, out
of 512M I only have 21M in the page cache, which sounds
really low.  Something for me to look at.
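
One quick way to check that suspicion (assuming a Linux /proc
filesystem) is to look at how much of RAM the page cache is
currently holding, and whether a warm re-run closes the gap:

```shell
# How much of RAM is the page cache holding right now?  (Linux)
grep -E '^(MemTotal|Cached):' /proc/meminfo

# If .git/index is hot in the page cache, a second run of
# git-write-tree should show real ~= user + sys:
#   cd /path/to/repo && time git write-tree >/dev/null
```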

> Which would imply pretty major surgery - you'd have to add the tree entry 
> information to the index file, and make sure they got invalidated properly 
> (all the way to the root) whenever adding/deleting/updating a path in the 
> index file.
>
> Quite frankly, I don't think it's really worth it.

For the current size of the kernel tree I agree.

It is a potential scaling limitation, and if someone starts
tracking really big trees with git it may be worth revisiting.

> Yes, it would speed up applying of huge patch-sets, but it's not like 
> we're really slow at that even now, and I suspect you'd be better off 
> trying to either live with it, or trying to see if you could change your 
> workflow. There clearly _are_ tools that are better at handling pure 
> patches, with quilt being the obvious example.

Probably.  For my workflow, not having to switch tool chains is
the biggest win, which is part of what the -C is about.
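
For reference, the -C mentioned here is git-apply's limited-context
matching: -C<n> requires only <n> lines of surrounding context to
match, rather than all of them, so a patch against a slightly
drifted tree can still apply.  A throwaway-repo sketch (file names
and commit identity are made up for illustration):

```shell
# Build a scratch repo with one tracked file.
cd "$(mktemp -d)"
git init -q .
printf 'one\ntwo\nthree\nfour\nfive\n' > file
git add file
git -c user.name=t -c user.email=t@example.com commit -qm init

# Make a change, capture it as a patch, then undo the change.
printf 'one\ntwo\nTHREE\nfour\nfive\n' > file
git diff > change.patch
git checkout -q -- file

# Apply with only one line of context required to match.
git apply --check -C1 change.patch
git apply -C1 change.patch
grep THREE file
```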


> I routinely apply 100-200 patches in a go, and that's fast enough to not 
> even be an issue. Yes, I have reasonably fast hardware, but we're likely 
> talking thousands of patches in a series for it to be _really_ painful 
> even on pretty basic developer hardware. Even a slow machine should do a 
> few hundred patches in a couple of minutes.
>
> Maybe enough time to get a cup of coffee, but no more than it would take 
> to compile the project.

Agreed.  I did the analysis so I could understand what was going on.
If the analysis revealed low hanging fruit I would have plucked it.

Eric
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
