On Wed, Mar 17, 2021 at 10:12 PM Matheus Tavares
<matheus.bernardino@xxxxxx> wrote:

> The above benchmarks show that parallel checkout is most effective on
> repositories located on an SSD or over a distributed file system. For
> local file systems on spinning disks, and/or older machines, the
> parallelism does not always bring a good performance. For this reason,
> the default value for checkout.workers is one, a.k.a. sequential
> checkout.

I wonder how many people are still using HDDs, and how many will still
be using them in a few years.

I think having 1 as the default value for checkout.workers might be
good for a while, for backward compatibility and stability, while
people who are interested test parallel checkout in different
environments (see the example commands below). But in a few releases,
after any bugs have been fixed, we might want to switch to a default,
maybe 10, that would provide a significant speedup for most people and
justify the added complexity, especially as your numbers still show a
small speedup on HDD when using 10 workers.

> To decide the default value for checkout.thresholdForParallelism,
> another benchmark was executed in the "Local SSD" setup, where parallel
> checkout showed to be beneficial. This time, we compared the runtime of
> a `git checkout -f`, with and without parallelism, after randomly
> removing an increasing number of files from the Linux working tree. The
> "sequential fallback" column bellow corresponds to the executions where

s/bellow/below/

> checkout.workers was 10 but checkout.thresholdForParallelism was equal
> to the number of to-be-updated files plus one (so that we end up writing
> sequentially). Each test case was sampled 15 times, and each sample had
> a randomly different set of files removed. Here are the results:
>
>               sequential fallback       10 workers           speedup
> 10   files     772.3 ms ± 12.6 ms    769.0 ms ± 13.6 ms    1.00 ± 0.02
> 20   files     780.5 ms ± 15.8 ms    775.2 ms ±  9.2 ms    1.01 ± 0.02
> 50   files     806.2 ms ± 13.8 ms    767.4 ms ±  8.5 ms    1.05 ± 0.02
> 100  files     833.7 ms ± 21.4 ms    750.5 ms ± 16.8 ms    1.11 ± 0.04
> 200  files     897.6 ms ± 30.9 ms    730.5 ms ± 14.7 ms    1.23 ± 0.05
> 500  files    1035.4 ms ± 48.0 ms    677.1 ms ± 22.3 ms    1.53 ± 0.09
> 1000 files    1244.6 ms ± 35.6 ms    654.0 ms ± 38.3 ms    1.90 ± 0.12
> 2000 files    1488.8 ms ± 53.4 ms    658.8 ms ± 23.8 ms    2.26 ± 0.12
>
> From the above numbers, 100 files seems to be a reasonable default value
> for the threshold setting.
>
> Note: Up to 1000 files, we observe a drop in the execution time of the
> parallel code with an increase in the number of files. This is a rather
> odd behavior, but it was observed in multiple repetitions. Above 1000
> files, the execution time increases according to the number of files, as
> one would expect.
>
> About the test environments: Local SSD tests were executed on an
> i7-7700HQ (4 cores with hyper-threading) running Manjaro Linux. Local
> HDD tests were executed on an Intel(R) Xeon(R) E3-1230 (also 4 cores
> with hyper-threading), HDD Seagate Barracuda 7200.14 SATA 3.1, running
> Debian. NFS and EFS tests were executed on an Amazon EC2 c5n.xlarge
> instance, with 4 vCPUs. The Linux NFS server was running on a m6g.large
> instance with 2 vCPUSs and a 1 TB EBS GP2 volume. Before each timing,
> the linux repository was removed (or checked out back to its previous
> state), and `sync && sysctl vm.drop_caches=3` was executed.

Thanks for all these tests and details!

> Co-authored-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
> Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx>

[...]
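By the way, for people who want to experiment with this in their own
environments, as suggested above, it should just be a matter of
something like the following (the config names come from your patch;
the values here are only examples):

	git config checkout.workers 10
	git config checkout.thresholdForParallelism 100
	git checkout -f

(assuming there is something for `git checkout -f` to rewrite, for
example after deleting some tracked files as in your benchmark).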
> @@ -23,6 +25,19 @@ enum pc_status parallel_checkout_status(void)
>  	return parallel_checkout.status;
>  }
>
> +#define DEFAULT_THRESHOLD_FOR_PARALLELISM 100

Using a "static const int" might be a bit better.

> +void get_parallel_checkout_configs(int *num_workers, int *threshold)
> +{
> +	if (git_config_get_int("checkout.workers", num_workers))
> +		*num_workers = 1;

I think it's better when an important default value like this "1" is
made more visible using a "static const int" or a "#define" (see the
sketch at the end of this email).

> +	else if (*num_workers < 1)
> +		*num_workers = online_cpus();
> +
> +	if (git_config_get_int("checkout.thresholdForParallelism", threshold))
> +		*threshold = DEFAULT_THRESHOLD_FOR_PARALLELISM;
> +}
> +
>  void init_parallel_checkout(void)
>  {
>  	if (parallel_checkout.status != PC_UNINITIALIZED)

> @@ -554,11 +569,9 @@ static void write_items_sequentially(struct checkout *state)
>  		write_pc_item(&parallel_checkout.items[i], state);
>  }
>
> -#define DEFAULT_NUM_WORKERS 2

As I said above, it doesn't look like a good idea to have removed this.

> -int run_parallel_checkout(struct checkout *state)
> +int run_parallel_checkout(struct checkout *state, int num_workers, int threshold)
>  {
> -	int ret, num_workers = DEFAULT_NUM_WORKERS;
> +	int ret;
>
>  	if (parallel_checkout.status != PC_ACCEPTING_ENTRIES)
>  		BUG("cannot run parallel checkout: uninitialized or already running");

> @@ -568,7 +581,7 @@ int run_parallel_checkout(struct checkout *state)
>  	if (parallel_checkout.nr < num_workers)
>  		num_workers = parallel_checkout.nr;
>
> -	if (num_workers <= 1) {
> +	if (num_workers <= 1 || parallel_checkout.nr < threshold) {

Here we check the number of workers...

>  		write_items_sequentially(state);
>  	} else {
>  		struct pc_worker *workers = setup_workers(state, num_workers);

[...]

> @@ -480,7 +483,8 @@ static int check_updates(struct unpack_trees_options *o,
>  		}
>  	}
>  	stop_progress(&progress);
> -	errs |= run_parallel_checkout(&state);
> +	if (pc_workers > 1)

...but the number of workers was already checked here.

> +		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold);
>  	errs |= finish_delayed_checkout(&state, NULL);
>
>  	git_attr_set_direction(GIT_ATTR_CHECKIN);
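To illustrate the "static const int" suggestions above, here is a
rough, untested sketch (the DEFAULT_CHECKOUT_WORKERS name is just
something I made up; the function body is otherwise what your patch
already has):

	/* Both defaults named and visible in one place. */
	static const int DEFAULT_CHECKOUT_WORKERS = 1;
	static const int DEFAULT_THRESHOLD_FOR_PARALLELISM = 100;

	void get_parallel_checkout_configs(int *num_workers, int *threshold)
	{
		if (git_config_get_int("checkout.workers", num_workers))
			*num_workers = DEFAULT_CHECKOUT_WORKERS;
		else if (*num_workers < 1)
			*num_workers = online_cpus();

		if (git_config_get_int("checkout.thresholdForParallelism", threshold))
			*threshold = DEFAULT_THRESHOLD_FOR_PARALLELISM;
	}

And regarding the duplicated worker-count check, maybe one of the two
could go away, or a comment could explain why check_updates() needs
its own `pc_workers > 1` guard in addition to the `num_workers <= 1`
test in run_parallel_checkout().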