Parallelism for submodule update

Hello,
We have been using Git for many years and rely heavily on submodules.

When updating submodules, only the fetch is done in parallel (controlled by the submodule.fetchJobs config or --jobs); the checkout is done sequentially.
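
For reference, a minimal sketch of how the parallel fetch is configured today. The job count of 8 and the throwaway repository are only illustrative assumptions, not values from our setup:

```shell
# Throwaway repository so the commands do not touch a real checkout
repo=$(mktemp -d)
git init --quiet "$repo"

# Allow up to 8 parallel jobs for the fetch/clone phase of submodule updates
git -C "$repo" config submodule.fetchJobs 8

# Read the value back to confirm it is set
jobs=$(git -C "$repo" config --get submodule.fetchJobs)
echo "submodule.fetchJobs=$jobs"

# Per-invocation alternative inside a real superproject (not run here):
#   git submodule update --init --recursive --jobs 8
```

Either way, the setting only affects the fetch phase; the checkout that follows still runs one submodule at a time.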

I noticed the following when cloning with
- scalar clone --full-clone --recurse-submodules <URL>
or
- git clone --filter=blob:none --also-filter-submodules --recurse-submodules <URL>

We lose performance, because the blobs are fetched in the sequential checkout phase rather than in the parallel fetch phase.

Furthermore, even without partial clone, network and disk are not utilized well: the network is busy first (fetch), and only afterwards the disk (checkout).

Since the checkout is local to each submodule (there are no shared resources to block on), it would be great if we could move the checkout into the parallelized part, for example by doing fetch and checkout (including blob fetching) in one step with run_processes_parallel_tr2.

I expect this would significantly improve performance, especially with partial clones.

Do you think this is possible? Am I missing anything?

Best regards,

Christian Zitzmann





