On 6/28/2022 6:50 AM, Pavel Rappo wrote:

Hi Pavel! Welcome.

> I have a repo of the following characteristics:
>
> * 1 branch
> * 100,000 commits

This is not too large.

> * 1TB in size

This _is_ large.

> * The tip of the branch has 55,000 files

And again, this is not large. Taken together, this means you have some
very large files in your repo, perhaps even binary files that you
don't intend to search.

> * No new commits are expected: the repo is abandoned and kept for
>   archaeological purposes.
>
> Typically, a `git log -S/-G` lookup takes around a minute to complete.
> I would like to significantly reduce that time. How can I do that? I
> can spend up to 10x more disk space, if required. The machine has 10
> cores and 32GB of RAM.

You are using -S<string> or -G<regex> to see which commits change the
number of matches of that <string> or <regex>. If you don't provide a
pathspec, then Git will search every changed file, including those
very large binary files. Perhaps you'd like to start by providing a
pathspec that limits the search to only the meaningful code files? (A
sketch follows below my signature.)

As far as I know, Git doesn't have any data structures that can speed
up content-based matches like this. The commit-graph's changed-path
Bloom filters only help Git with questions like "did this specific
file change?", which is not going to be a critical code path in what
you're describing.

I'm not sure what you're actually trying to ask with -S or -G, so
maybe it is worth considering other types of queries, such as
-L<n>,<m>:<file> or something. This is just a shot in the dark, as you
might be doing the only thing you _can_ do to solve your problem.

Thanks,
-Stolee
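
P.S. A few command sketches to make the above concrete. The search
string, function name, and paths in them are hypothetical
placeholders; substitute your own.

First, the pathspec-limited pickaxe. If the code you care about lives
under src/ (a made-up path) and you're tracking a symbol like
my_function (also made up), restricting the walk to that pathspec
keeps Git from diffing the huge binaries:

  # Only inspect changes to files under src/; "my_function" and
  # "src/" are placeholders for your own string and paths.
  git log -S'my_function' -- src/

  # Same idea with -G, which matches a regex against the diff text:
  git log -G'my_function\(' -- src/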
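For what it's worth, if you do experiment with pathspec-limited walks,
this is how you would build a commit-graph with the changed-path Bloom
filters mentioned above. They only help the "which commits touched
this path?" part of the walk, not the string matching itself:

  # One-time, offline: write the commit-graph for all reachable
  # commits, including changed-path Bloom filters.
  git commit-graph write --reachable --changed-paths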
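And the -L form, which follows the history of a line range or a
function within a single file (the file and function names here are
again hypothetical):

  # History of lines 10-20 of src/main.c:
  git log -L 10,20:src/main.c

  # History of the function "my_function" in that file, using the
  # funcname form of -L:
  git log -L :my_function:src/main.c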