On Tue, Dec 06, 2011 at 11:42:42PM -0500, Jeff King wrote: > On Wed, Dec 07, 2011 at 12:01:37AM +0100, René Scharfe wrote: > > > Reading of git objects needs to be protected by an exclusive lock > > and cannot be parallelized. Searching the read buffers can be done > > in parallel, but for simple expressions threading is a net loss due > > to its overhead, as measured by Thomas. Turn it off unless we're > > searching in the worktree. > > Based on my earlier numbers, I was going to complain that we should > also be checking the "simple expressions" assumption here, as time spent > in the actual regex might be important. > > However, after trying to repeat my experiment, I think the numbers I > posted earlier were misleading. For example, using my "more complex" > regex of 'a.*b': > > $ time git grep --threads=8 'a.*b' HEAD >/dev/null > real 0m8.655s > user 0m23.817s > sys 0m0.480s Dumb question (I missed the beginning of the conversation): what kind of storage are you using, and is the data already cached? I seem to recall part of the motivation for the multithreading being NFS, where the goal isn't so much to keep CPU's busy as it is to keep the network busy. Probably a bigger problem for something like "git status" which I think ends up doing a series of stat's (which can each require a round trip to the server in the NFS case), as it is a problem for something like git-grep that's also doing reads. Just a plea for considering the IO cost as well when making these kinds of decisions.... (Which maybe you already do, apologies again for just naively dropping into the middle of a thread.) --b. > > Look at that sweet, sweet parallelism. It's a quad-core with > hyperthreading, so we're not getting the 8x speedup we might hope for > (presumably due to lock contention on extracting blobs), but hey, 3x > isn't bad. Except, wait: > > $ time git grep --threads=0 'a.*b' HEAD >/dev/null > real 0m7.651s > user 0m7.600s > sys 0m0.048s > > We can get 1x on a single core, but the total time is lower! This > processor is an i7 with "turbo boost", which means it clocks higher in > single-core mode than when multiple cores are active. So the numbers I > posted earlier were misleading. Yes, we got parallelism, but at the cost > of knocking the clock speed down for a net loss. > > The sweet spot for me seems to be: > > $ time git grep --threads=2 'a.*b' HEAD >/dev/null > real 0m6.303s > user 0m11.129s > sys 0m0.220s > > I'd be curious to see results from somebody with a quad-core (or more) > without turbo boost; I suspect that threading may have more benefit > there, even though we have some lock contention for blobs. > > > --- a/builtin/grep.c > > +++ b/builtin/grep.c > > @@ -1048,7 +1048,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix) > > nr_threads = 0; > > #else > > if (nr_threads == -1) > > - nr_threads = (online_cpus() > 1) ? THREADS : 0; > > + nr_threads = (online_cpus() > 1 && !list.nr) ? THREADS : 0; > > > > if (nr_threads > 0) { > > opt.use_threads = 1; > > This doesn't kick in for "--cached", which has the same performance > characteristics as grepping a tree. I think you want to add "&& !cached" to > the conditional. > > -Peff > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html