> Calvin Wan <calvinwan@xxxxxxxxxx> writes:
>
> > I also wanted to pose another question to list regarding defaults for
> > parallel processes. For jobs that clearly scale with the number of
> > processes (aka jobs that are mostly processor bound), it is obvious that
> > setting the default number of processes to the number of available cores
> > is the most optimal option. However, this changes when the job is mostly
> > I/O bound or has a combination of I/O and processing. Looking at my use
> > case for `status` on a cold cache (see below), we notice that increasing
> > the number of parallel processes speeds up status, but after a certain
> > number, it actually starts slowing down.
>
> I do not offhand recall how the default parallelism is computed
> there, but if I am correct to suspect that "git grep" has a similar
> scaling pattern, i.e. the threads all need to compete for I/O to
> read from the filesystem to find needles from the haystack, perhaps
> it would give us a precedent to model the behaviour of this part of
> the code, too, hopefully?

Setting grep.threads=0 does default it to the number of available
cores (at least the documentation is clear about this). I tested
"git grep" on my machine and found that it started slowing down
after 4 threads -- most likely because my NVMe SSD uses 4 PCIe
lanes, i.e. it can do at most 4 reads in parallel. AFAIK, there is
no way to tell how many reads a disk can do in parallel. This,
coupled with the fact that different commands have varying levels
of I/O requirements, makes it impossible to pick a single
"reasonable" default number of threads.
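For what it's worth, the scaling pattern above is easy to probe outside of git. Here is a small standalone Python sketch (hypothetical, not git code) that times reading a batch of files at several thread counts; on a cold cache it should show the same knee where extra threads stop helping, though on a warm page cache the reads are served from memory and the effect mostly disappears:

```python
# Hypothetical probe: time reading a fixed set of files with N threads
# to find where parallel I/O stops paying off on a given disk.
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def make_files(dirname, count=32, size=256 * 1024):
    """Create `count` files of `size` random bytes; return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(dirname, f"blob{i}")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths


def read_all(paths, threads):
    """Read every file with a pool of `threads` workers.

    Returns (elapsed_seconds, total_bytes_read).
    """
    def read_one(path):
        with open(path, "rb") as f:
            return len(f.read())

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        total = sum(pool.map(read_one, paths))
    return time.perf_counter() - start, total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        paths = make_files(d)
        for threads in (1, 2, 4, 8, 16):
            elapsed, total = read_all(paths, threads)
            print(f"{threads:2} threads: {elapsed:.4f}s ({total} bytes)")
```

Dropping the page cache between runs (e.g. `echo 3 > /proc/sys/vm/drop_caches` on Linux, as root) is what actually exposes the disk's parallel-read limit; without that, the numbers mostly measure memory bandwidth.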