In builtin/grep.c:add_work() we pre-load the userdiff drivers before
adding the grep_source to the todo list. This operation is currently
performed after acquiring the grep_mutex, but as it's already
thread-safe, we don't need to protect it here. So let's move it out of
the critical section, which should reduce thread contention and improve
performance.

Running[1] `git grep --threads=8 abcd[02] HEAD` on chromium's
repository[2], I got the following mean times for 30 executions after 2
warmups:

        Original                 |  6.2886s
        -------------------------|-----------
        Out of critical section  |  5.7852s

[1]: Tests performed on an i7-7700HQ with 16GB of RAM and SSD, running
     Manjaro Linux.
[2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx>
---
 builtin/grep.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 163f14b60d..d275b76647 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -92,8 +92,11 @@ static pthread_cond_t cond_result;
 
 static int skip_first_line;
 
-static void add_work(struct grep_opt *opt, const struct grep_source *gs)
+static void add_work(struct grep_opt *opt, struct grep_source *gs)
 {
+	if (opt->binary != GREP_BINARY_TEXT)
+		grep_source_load_driver(gs, opt->repo->index);
+
 	grep_lock();
 
 	while ((todo_end+1) % ARRAY_SIZE(todo) == todo_done) {
@@ -101,9 +104,6 @@ static void add_work(struct grep_opt *opt, const struct grep_source *gs)
 	}
 
 	todo[todo_end].source = *gs;
-	if (opt->binary != GREP_BINARY_TEXT)
-		grep_source_load_driver(&todo[todo_end].source,
-					opt->repo->index);
 	todo[todo_end].done = 0;
 	strbuf_reset(&todo[todo_end].out);
 	todo_end = (todo_end + 1) % ARRAY_SIZE(todo);
-- 
2.23.0
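
For readers who want to see the pattern in isolation, here is a minimal standalone sketch of the same idea: do the per-item preparation before taking the queue lock, and hold the lock only for the ring-buffer bookkeeping. It is not Git code. The names `struct item`, `prepare()`, `enqueue()`, `consumer()`, and `run_demo()` are hypothetical, and `prepare()` is only a stand-in for a thread-safe step such as the driver pre-load discussed above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_SLOTS 4	/* ring buffer size; one slot always stays empty */
#define ITEMS 8		/* number of items pushed by the demo */

struct item {
	int id;
	int prepared;
};

static struct item queue[QUEUE_SLOTS];
static size_t q_head;	/* next slot to consume */
static size_t q_tail;	/* next slot to fill */
static pthread_mutex_t q_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_not_full = PTHREAD_COND_INITIALIZER;
static pthread_cond_t q_not_empty = PTHREAD_COND_INITIALIZER;

/* Hypothetical preparation step: touches only the caller's item. */
static void prepare(struct item *it)
{
	it->prepared = 1;
}

static void enqueue(struct item *it)
{
	/* Work that needs no shared state happens before locking. */
	prepare(it);

	pthread_mutex_lock(&q_mutex);
	while ((q_tail + 1) % QUEUE_SLOTS == q_head)
		pthread_cond_wait(&q_not_full, &q_mutex);
	queue[q_tail] = *it;	/* copy the already-prepared item */
	q_tail = (q_tail + 1) % QUEUE_SLOTS;
	pthread_cond_signal(&q_not_empty);
	pthread_mutex_unlock(&q_mutex);
}

static void *consumer(void *arg)
{
	int *prepared_count = arg;
	int i;

	for (i = 0; i < ITEMS; i++) {
		struct item it;

		pthread_mutex_lock(&q_mutex);
		while (q_head == q_tail)
			pthread_cond_wait(&q_not_empty, &q_mutex);
		it = queue[q_head];
		q_head = (q_head + 1) % QUEUE_SLOTS;
		pthread_cond_signal(&q_not_full);
		pthread_mutex_unlock(&q_mutex);

		/* Every dequeued item should arrive already prepared. */
		assert(it.prepared);
		(*prepared_count)++;
	}
	return NULL;
}

/* Pushes ITEMS items through the queue; returns how many were consumed. */
int run_demo(void)
{
	int prepared_count = 0;
	pthread_t tid;
	int i;

	q_head = q_tail = 0;
	if (pthread_create(&tid, NULL, consumer, &prepared_count))
		return -1;
	for (i = 0; i < ITEMS; i++) {
		struct item it = { i, 0 };
		enqueue(&it);
	}
	pthread_join(tid, NULL);
	return prepared_count;
}
```

Calling `run_demo()` from a program built with `-pthread` pushes eight items through the queue. The sketch only illustrates the locking structure; whether moving a given step out of the lock is safe depends on that step being thread-safe on its own, as the commit message argues for the driver pre-load.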