2011/12/23 Thomas Rast <trast@xxxxxxxxxxxxxxx>:
> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> On Fri, Dec 2, 2011 at 14:07, Thomas Rast <trast@xxxxxxxxxxxxxxx> wrote:
>>
>>> I conjecture that this is caused by contention on
>>> read_sha1_mutex. [...] So disable threading entirely when not
>>> scanning the worktree
>>
>> Why does git-grep even need to keep a mutex to call read_sha1_file()?
>> It's inherently a read-only operation isn't it? If the lock is needed
>> because data is being shared between threads in sha1_file.c shouldn't
>> we tackle that instead of completely disabling threading?
>
> The problem is that all sorts of data is shared. See
>
> http://thread.gmane.org/gmane.comp.version-control.git/186618
>
> But I need to go through it again, there are some races and double locks
> in the posted version.

I mentioned this on IRC, but I thought I'd bring it up here too.

Is the expensive part of git-grep all the setup work, or the actual
traversal and searching? I'm guessing it's the latter.

In that case an easy way to do git-grep in parallel would be to simply
spawn multiple sub-processes, e.g. if we had 1000 files and 4 cores:

 1. Split the 1000 into 4 parts 250 each.
 2. Spawn 4 processes as: git grep <pattern> -- <250 files>
 3. Aggregate all of the results in the parent process
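
To make the three steps concrete, here is a rough sketch in Python. It is
only an illustration of the idea, not a proposal for how git itself should
do it. The names split_evenly, grep_chunk and parallel_grep are made up for
this example, and it assumes a "git" binary on PATH and that it runs inside
a work tree. The only git behaviour it relies on is "git grep -e <pattern>
-- <paths>" and the exit status of 1 when nothing matched.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, parts):
    """Step 1: split items into at most `parts` contiguous, near-equal chunks."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are fewer items than parts
            chunks.append(items[start:end])
        start = end
    return chunks


def grep_chunk(pattern, files):
    """Step 2: run one `git grep` child process over a chunk of paths."""
    proc = subprocess.run(
        ["git", "grep", "-e", pattern, "--", *files],
        capture_output=True,
        text=True,
    )
    # git grep exits with 1 when nothing matched; treat other non-zero codes as errors.
    if proc.returncode not in (0, 1):
        raise RuntimeError(proc.stderr.strip())
    return proc.stdout.splitlines()


def parallel_grep(pattern, files, jobs=4):
    """Step 3: collect the output of all children in the parent."""
    chunks = split_evenly(files, jobs)
    if not chunks:
        return []
    # The threads only wait on child processes; the searching itself
    # happens in the separate git processes.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        results = pool.map(lambda chunk: grep_chunk(pattern, chunk), chunks)
        # pool.map yields results in chunk order, so output order is stable.
        return [line for lines in results for line in lines]
```

The file list could come from "git ls-files -z" split on NUL bytes, after
which parallel_grep("read_sha1_mutex", files, jobs=4) would return the
matching lines. Two caveats with this sketch: each child pays the full
setup cost again, so it only helps if the searching really is the expensive
part, and very long file lists may run into the operating system's limit on
command-line length.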