> Calvin Wan <calvinwan@xxxxxxxxxx> writes:
>
> > I also wanted to pose another question to list regarding defaults for
> > parallel processes. For jobs that clearly scale with the number of
> > processes (aka jobs that are mostly processor bound), it is obvious that
> > setting the default number of processes to the number of available cores
> > is the most optimal option. However, this changes when the job is mostly
> > I/O bound or has a combination of I/O and processing. Looking at my use
> > case for `status` on a cold cache (see below), we notice that increasing
> > the number of parallel processes speeds up status, but after a certain
> > number, it actually starts slowing down.
>
> I do not offhand recall how the default parallelism is computed
> there, but if I am correct to suspect that "git grep" has a similar
> scaling pattern, i.e. the threads all need to compete for I/O to
> read from the filesystem to find needles from the haystack, perhaps
> it would give us a precedent to model the behaviour of this part of
> the code, too, hopefully?

Setting grep.threads=0 does default it to the number of available
cores (at least the documentation is clear about this). I tested
"git grep" on my machine and found that it started slowing down
after 4 threads -- most likely because my NVMe SSD uses 4 PCIe
lanes, i.e. it can do at most 4 reads in parallel. AFAIK, there is
no way to tell how many reads a disk can do in parallel. This,
coupled with the fact that different commands have varying levels
of I/O requirements, makes it impossible to pick a single
"reasonable" default number of threads.
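For what it's worth, the scaling pattern above is easy to probe outside of git. Here is a small standalone Python sketch (hypothetical, not git code) that times reading a batch of files at several thread counts; on a cold cache it should show the same knee where extra threads stop helping, though on a warm page cache the reads are served from memory and the effect mostly disappears:

```python
# Hypothetical probe: time reading a fixed set of files with N threads
# to find where parallel I/O stops paying off on a given disk.
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def make_files(dirname, count=32, size=256 * 1024):
    """Create `count` files of `size` random bytes; return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(dirname, f"blob{i}")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths


def read_all(paths, threads):
    """Read every file with a pool of `threads` workers.

    Returns (elapsed_seconds, total_bytes_read).
    """
    def read_one(path):
        with open(path, "rb") as f:
            return len(f.read())

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        total = sum(pool.map(read_one, paths))
    return time.perf_counter() - start, total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        paths = make_files(d)
        for threads in (1, 2, 4, 8, 16):
            elapsed, total = read_all(paths, threads)
            print(f"{threads:2} threads: {elapsed:.4f}s ({total} bytes)")
```

Dropping the page cache between runs (e.g. `echo 3 > /proc/sys/vm/drop_caches` on Linux, as root) is what actually exposes the disk's parallel-read limit; without that, the numbers mostly measure memory bandwidth.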