Re: [PATCH 4/2] grep: turn off threading for non-worktree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 07.12.2011 05:42, schrieb Jeff King:
> On Wed, Dec 07, 2011 at 12:01:37AM +0100, René Scharfe wrote:
> 
>> Reading of git objects needs to be protected by an exclusive lock
>> and cannot be parallelized.  Searching the read buffers can be done
>> in parallel, but for simple expressions threading is a net loss due
>> to its overhead, as measured by Thomas.  Turn it off unless we're
>> searching in the worktree.
> 
> Based on my earlier numbers, I was going to complain that we should
> also be checking the "simple expressions" assumption here, as time spent
> in the actual regex might be important.
> 
> However, after trying to repeat my experiment, I think the numbers I
> posted earlier were misleading. For example, using my "more complex"
> regex of 'a.*b':
> 
>   $ time git grep --threads=8 'a.*b' HEAD >/dev/null
>   real    0m8.655s
>   user    0m23.817s
>   sys     0m0.480s
> 
> Look at that sweet, sweet parallelism. It's a quad-core with
> hyperthreading, so we're not getting the 8x speedup we might hope for
> (presumably due to lock contention on extracting blobs), but hey, 3x
> isn't bad. Except, wait:
> 
>   $ time git grep --threads=0 'a.*b' HEAD >/dev/null
>   real    0m7.651s
>   user    0m7.600s
>   sys     0m0.048s
> 
> We can get 1x on a single core, but the total time is lower! This
> processor is an i7 with "turbo boost", which means it clocks higher in
> single-core mode than when multiple cores are active. So the numbers I
> posted earlier were misleading. Yes, we got parallelism, but at the cost
> of knocking the clock speed down for a net loss.

Ugh, right, Turbo Boost complicates matters.

I don't understand the multiplied user time in the threaded case,
though.  Is it caused by busy-waiting?  Thomas reported similar numbers
earlier.

> The sweet spot for me seems to be:
> 
>   $ time git grep --threads=2 'a.*b' HEAD >/dev/null
>   real    0m6.303s
>   user    0m11.129s
>   sys     0m0.220s
> 
> I'd be curious to see results from somebody with a quad-core (or more)
> without turbo boost; I suspect that threading may have more benefit
> there, even though we have some lock contention for blobs.

It would be nice if we could come up with simple rules to calculate
defaults for the number of threads on a given run.  Users shouldn't have
to specify this option normally.  And it would be good if these rules
didn't require a list of all CPUs known to git. :)

>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -1048,7 +1048,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>  	nr_threads = 0;
>>  #else
>>  	if (nr_threads == -1)
>> -		nr_threads = (online_cpus() > 1) ? THREADS : 0;
>> +		nr_threads = (online_cpus() > 1 && !list.nr) ? THREADS : 0;
>>  
>>  	if (nr_threads > 0) {
>>  		opt.use_threads = 1;
> 
> This doesn't kick in for "--cached", which has the same performance
> characteristics as grepping a tree. I think you want to add "&& !cached" to
> the conditional.

Oh, yes.

René
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]