Re: [PATCH 4/2] grep: turn off threading for non-worktree

"J. Bruce Fields" <bfields@xxxxxxxxxxxx> · Wed, 7 Dec 2011 15:11:05 -0500

On Tue, Dec 06, 2011 at 11:42:42PM -0500, Jeff King wrote:
> On Wed, Dec 07, 2011 at 12:01:37AM +0100, René Scharfe wrote:
> 
> > Reading of git objects needs to be protected by an exclusive lock
> > and cannot be parallelized.  Searching the read buffers can be done
> > in parallel, but for simple expressions threading is a net loss due
> > to its overhead, as measured by Thomas.  Turn it off unless we're
> > searching in the worktree.
> 
> Based on my earlier numbers, I was going to complain that we should
> also be checking the "simple expressions" assumption here, as time spent
> in the actual regex might be important.
> 
> However, after trying to repeat my experiment, I think the numbers I
> posted earlier were misleading. For example, using my "more complex"
> regex of 'a.*b':
> 
>   $ time git grep --threads=8 'a.*b' HEAD >/dev/null
>   real    0m8.655s
>   user    0m23.817s
>   sys     0m0.480s

Dumb question (I missed the beginning of the conversation): what kind of
storage are you using, and is the data already cached?

I seem to recall part of the motivation for the multithreading being
NFS, where the goal isn't so much to keep CPU's busy as it is to keep
the network busy.

Probably a bigger problem for something like "git status" which I think
ends up doing a series of stat's (which can each require a round trip to
the server in the NFS case), as it is a problem for something like
git-grep that's also doing reads.

Just a plea for considering the IO cost as well when making these kinds
of decisions....

(Which maybe you already do, apologies again for just naively dropping
into the middle of a thread.)

--b.

> 
> Look at that sweet, sweet parallelism. It's a quad-core with
> hyperthreading, so we're not getting the 8x speedup we might hope for
> (presumably due to lock contention on extracting blobs), but hey, 3x
> isn't bad. Except, wait:
> 
>   $ time git grep --threads=0 'a.*b' HEAD >/dev/null
>   real    0m7.651s
>   user    0m7.600s
>   sys     0m0.048s
> 
> We can get 1x on a single core, but the total time is lower! This
> processor is an i7 with "turbo boost", which means it clocks higher in
> single-core mode than when multiple cores are active. So the numbers I
> posted earlier were misleading. Yes, we got parallelism, but at the cost
> of knocking the clock speed down for a net loss.
> 
> The sweet spot for me seems to be:
> 
>   $ time git grep --threads=2 'a.*b' HEAD >/dev/null
>   real    0m6.303s
>   user    0m11.129s
>   sys     0m0.220s
> 
> I'd be curious to see results from somebody with a quad-core (or more)
> without turbo boost; I suspect that threading may have more benefit
> there, even though we have some lock contention for blobs.
> 
> > --- a/builtin/grep.c
> > +++ b/builtin/grep.c
> > @@ -1048,7 +1048,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> >  	nr_threads = 0;
> >  #else
> >  	if (nr_threads == -1)
> > -		nr_threads = (online_cpus() > 1) ? THREADS : 0;
> > +		nr_threads = (online_cpus() > 1 && !list.nr) ? THREADS : 0;
> >  
> >  	if (nr_threads > 0) {
> >  		opt.use_threads = 1;
> 
> This doesn't kick in for "--cached", which has the same performance
> characteristics as grepping a tree. I think you want to add "&& !cached" to
> the conditional.
> 
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html