Am 07.12.2011 05:42, schrieb Jeff King: > On Wed, Dec 07, 2011 at 12:01:37AM +0100, René Scharfe wrote: > >> Reading of git objects needs to be protected by an exclusive lock >> and cannot be parallelized. Searching the read buffers can be done >> in parallel, but for simple expressions threading is a net loss due >> to its overhead, as measured by Thomas. Turn it off unless we're >> searching in the worktree. > > Based on my earlier numbers, I was going to complain that we should > also be checking the "simple expressions" assumption here, as time spent > in the actual regex might be important. > > However, after trying to repeat my experiment, I think the numbers I > posted earlier were misleading. For example, using my "more complex" > regex of 'a.*b': > > $ time git grep --threads=8 'a.*b' HEAD >/dev/null > real 0m8.655s > user 0m23.817s > sys 0m0.480s > > Look at that sweet, sweet parallelism. It's a quad-core with > hyperthreading, so we're not getting the 8x speedup we might hope for > (presumably due to lock contention on extracting blobs), but hey, 3x > isn't bad. Except, wait: > > $ time git grep --threads=0 'a.*b' HEAD >/dev/null > real 0m7.651s > user 0m7.600s > sys 0m0.048s > > We can get 1x on a single core, but the total time is lower! This > processor is an i7 with "turbo boost", which means it clocks higher in > single-core mode than when multiple cores are active. So the numbers I > posted earlier were misleading. Yes, we got parallelism, but at the cost > of knocking the clock speed down for a net loss. Ugh, right, Turbo Boost complicates matters. I don't understand the multiplied user time in the threaded case, though. Is it caused by busy-waiting? Thomas reported similar numbers earlier. > The sweet spot for me seems to be: > > $ time git grep --threads=2 'a.*b' HEAD >/dev/null > real 0m6.303s > user 0m11.129s > sys 0m0.220s > > I'd be curious to see results from somebody with a quad-core (or more) > without turbo boost; I suspect that threading may have more benefit > there, even though we have some lock contention for blobs. It would be nice if we could come up with simple rules to calculate defaults for the number of threads on a given run. Users shouldn't have to specify this option normally. And it would be good if these rules didn't require a list of all CPUs known to git. :) >> --- a/builtin/grep.c >> +++ b/builtin/grep.c >> @@ -1048,7 +1048,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix) >> nr_threads = 0; >> #else >> if (nr_threads == -1) >> - nr_threads = (online_cpus() > 1) ? THREADS : 0; >> + nr_threads = (online_cpus() > 1 && !list.nr) ? THREADS : 0; >> >> if (nr_threads > 0) { >> opt.use_threads = 1; > > This doesn't kick in for "--cached", which has the same performance > characteristics as grepping a tree. I think you want to add "&& !cached" to > the conditional. Oh, yes. René -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html