Re: [PATCH v4] Threaded grep

On Tue, 26 Jan 2010, Benjamin Kramer wrote:
> 
> BSD and glibc have an "REG_STARTEND" flag to do that. I made a small
> PoC patch to use it if it's available but it didn't give any significant
> speedup on my system.

Goodie.  It's noticeable for me. This is what I reported earlier:

> > $ /usr/bin/time git grep void
> 
> Before:
> 
>         real    0m1.144s
>         user    0m0.988s
>         sys     0m0.148s
> 
> After:
>         real    0m0.290s
>         user    0m1.732s
>         sys     0m0.232s

and with your patch I get

	real	0m0.239s
	user	0m1.392s
	sys	0m0.276s

and the profile shows no strlen in it:

    57.12%      git  libc-2.11.1.so                 [.] re_search_internal
     5.59%      git  [kernel]                       [k] copy_user_generic_string
     4.09%      git  [kernel]                       [k] _raw_spin_lock
     2.57%      git  [kernel]                       [k] intel_pmu_enable_all
     2.46%      git  [kernel]                       [k] __d_lookup
     1.94%      git  libc-2.11.1.so                 [.] re_string_reconstruct
     1.87%      git  [kernel]                       [k] kmem_cache_alloc
     1.68%      git  libc-2.11.1.so                 [.] _int_free
     1.53%      git  [kernel]                       [k] find_get_page
     1.43%      git  [kernel]                       [k] update_curr
     1.27%      git  libc-2.11.1.so                 [.] __GI___libc_malloc
     1.17%      git  [kernel]                       [k] _atomic_dec_and_lock
     1.00%      git  libc-2.11.1.so                 [.] __GI_memcpy

Side note: the tail end of the profile isn't very stable, probably 
because the grep executes so quickly and in so many threads, so the 
functions in the one-percent range will move up and down the list 
depending on just exactly where we happened to get profile hits. 
Similarly, the _raw_spin_lock numbers vary.

But the big picture is stable, and that 57% number (and the non-lock 
copy_user_generic_string) is consistent. And your patch definitely helped 
actual performance, and the effect is visible in the profile: with strlen 
gone, re_search_internal's share went from ~52% to ~57%.

So ack on that patch. Looks like a good thing to do, and with the #ifdef, 
it looks like it should just automatically DTRT based on regexec 
implementation.
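
For illustration only, a minimal sketch of the #ifdef approach, not the 
actual patch: the helper name match_region and the copying fallback are 
assumptions made up for this example. Where REG_STARTEND is available, 
regexec takes the buffer bounds from rm_so/rm_eo of the first match slot, 
so the buffer does not need a NUL terminator; elsewhere the sketch falls 
back to a NUL-terminated copy.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): match a compiled pattern
 * against the first len bytes of buf, which need not be NUL-terminated.
 * Returns 0 on a match, REG_NOMATCH otherwise, like regexec().
 */
static int match_region(const regex_t *re, const char *buf, size_t len,
			regmatch_t *m)
{
#ifdef REG_STARTEND
	/* BSD/glibc extension: the search bounds come from m[0]. */
	m->rm_so = 0;
	m->rm_eo = (regoff_t)len;
	return regexec(re, buf, 1, m, REG_STARTEND);
#else
	/* Portable fallback (an assumption): search a terminated copy. */
	char *copy = malloc(len + 1);
	int ret;

	if (!copy)
		return REG_ESPACE;
	memcpy(copy, buf, len);
	copy[len] = '\0';
	ret = regexec(re, copy, 1, m, 0);
	free(copy);
	return ret;
#endif
}
```

Because the region starts at offset 0 here, the reported rm_so/rm_eo are 
plain offsets into buf in either branch.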

		Linus