This series adds helper workers to checkout, parallelizing the reading,
filtering and writing of multiple blobs to the working tree.

Since v1, I got the chance to benchmark parallel checkout on more
machines. The results showed that the parallelization is most effective
with repositories located on SSDs or on distributed file systems. For
local file systems on spinning disks, it does not always improve
performance; in fact, it sometimes causes a slowdown. But given the
results in the first two cases, I think it's worth having the parallel
code as an optional (and non-default) setting.

The size of the repository being checked out and the compression level
of the packfiles also influence how much performance gain we can get
from parallel checkout. For example, downloading the Linux repo from
GitHub and from kernel.org, I got packfiles of 2.9GB and 1.4GB,
respectively. The number of objects was the same, but GitHub's packfile
had a smaller number of delta-chains with size >= 7 [A]. For this
reason, the sequential checkout after GitHub's clone was considerably
faster than the sequential checkout after kernel.org's clone, and the
speedup from parallel checkout was more modest (although parallel
checkout was still faster in absolute terms).

[A]: https://docs.google.com/spreadsheets/d/1dDGLym77JAGCVYhKQHe44r3pqtrsvHrjS4NmD_Hqr6k/edit?usp=sharing

V2 got bigger with tests and some additional optimizations, so I decided
to divide the original series into two parts to facilitate reviewing.
This part consists of:

- The first 9 patches are preparatory steps in convert.c and entry.c.
- The middle 6 actually implement parallel checkout.
- The last 4 add tests.

Part II will contain some extra optimizations, like work stealing and
the creation of leading directories in parallel. With that, workers
won't need to stat() the path components again before opening the files
for writing. We will also skip some stat() calls during clone.

Major changes since v1:

General:
- Added tests.
- Parallel checkout is no longer the default, since not all machines
  benefit from it.
- Rebased on top of master to use the adjusted mem_pool API of
  en/mem-pool.

Patch 10:
- Converted BUG() to error(), in handle_results(), when we finish
  parallel checkout with pending entries. This is not really a BUG; it
  can happen when a worker dies before sending all of its results. Also,
  by emitting an error message instead of die()'ing, we can continue
  processing the next results and, thus, avoid wasting successful work.
- Added missing initialization of ci->status in enqueue_entry().
- Fixed a bug in which the collision report during clone would not be
  correct when the file that is written first appears after its
  colliding pair in the cache array.
- Reworded commit message and added a comment in handle_results() to
  explain why we retry writing entries with path collisions.
- Renamed CI_RETRY to CI_COLLISION, to make it easier to change the
  behavior on collided entries in the future, if necessary.
- Some other minor changes like:
  * Removed unnecessary PC_HANDLING_RESULTS status.
  * Statically allocated the global parallel_checkout struct.
  * Renamed checkout_item to parallel_checkout_item.

Patch 11:
- Made parse_and_save_result() safer by checking that the received data
  has the expected size, instead of trusting ci->status and possibly
  accessing an invalid address on errors.
- Limited the workers to the number of enqueued entries.
- Added a comment in packet_to_ci() mentioning why it's OK to encode
  NULL as a zero-length string when sending the working_tree_encoding to
  workers.
- Split the subprocess spawning and finalizing loops, to mitigate the
  spawn/wait cost.
- Don't die() when a worker exits with an error code (only report the
  error), to avoid wasting good work by not updating the index with the
  stat information from the written entries.
- Renamed checkout.workersThreshold to checkout.thresholdForParallelism.

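To make the parse_and_save_result() change in patch 11 more concrete,
here is a minimal sketch of the idea: validate the size of what a worker
sent before reading any field from it. This is not the code from the
series. The struct layouts, the `_sketch` names, and every status name
other than CI_COLLISION are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Status names are illustrative; only CI_COLLISION is named in this letter. */
enum ci_status_sketch {
	CI_PENDING = 0,
	CI_SUCCESS,
	CI_COLLISION,
	CI_FAILED,
};

/* Hypothetical fixed-size record sent by a worker for one entry. */
struct result_sketch {
	size_t id;   /* index of the entry in the queue */
	int status;  /* an enum ci_status_sketch value */
};

/* Hypothetical main-process view of a queued entry. */
struct item_sketch {
	int status;
};

/*
 * Store one worker result. Returns 0 on success, or -1 (leaving items[]
 * untouched) if the payload is not exactly one record, names an entry
 * outside the queue, or carries a status a worker may not report.
 */
int parse_and_save_result_sketch(const char *buf, size_t len,
				 struct item_sketch *items, size_t nr_items)
{
	struct result_sketch res;

	if (len != sizeof(res))
		return -1; /* truncated or oversized: do not read it */
	memcpy(&res, buf, sizeof(res)); /* buf may be unaligned */

	if (res.id >= nr_items)
		return -1;
	if (res.status != CI_SUCCESS && res.status != CI_COLLISION &&
	    res.status != CI_FAILED)
		return -1;

	items[res.id].status = res.status;
	return 0;
}
```

In this sketch, a rejected payload leaves the entry in its pending
state, so a caller could report the problem and keep processing the
remaining results instead of aborting.
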
Jeff Hostetler (4):
  convert: make convert_attrs() and convert structs public
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: add get_stream_filter_ca() variant
  convert: add conv_attrs classification

Matheus Tavares (15):
  entry: extract a header file for entry.c functions
  entry: make fstat_output() and read_blob_entry() public
  entry: extract cache_entry update from write_entry()
  entry: move conv_attrs lookup up to checkout_entry()
  entry: add checkout_entry_ca() which takes preloaded conv_attrs
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: support progress displaying
  make_transient_cache_entry(): optionally alloc from mem_pool
  builtin/checkout.c: complete parallel checkout support
  checkout-index: add parallel checkout support
  parallel-checkout: add tests for basic operations
  parallel-checkout: add tests related to clone collisions
  parallel-checkout: add tests related to .gitattributes
  ci: run test round with parallel-checkout enabled

 .gitignore                              |   1 +
 Documentation/config/checkout.txt       |  21 +
 Makefile                                |   2 +
 apply.c                                 |   1 +
 builtin.h                               |   1 +
 builtin/checkout--helper.c              | 142 ++++++
 builtin/checkout-index.c                |  17 +
 builtin/checkout.c                      |  21 +-
 builtin/difftool.c                      |   3 +-
 cache.h                                 |  34 +-
 ci/run-build-and-tests.sh               |   1 +
 convert.c                               | 121 +++--
 convert.h                               |  68 +++
 entry.c                                 | 102 ++--
 entry.h                                 |  54 ++
 git.c                                   |   2 +
 parallel-checkout.c                     | 631 ++++++++++++++++++++++++
 parallel-checkout.h                     | 103 ++++
 read-cache.c                            |  12 +-
 t/README                                |   4 +
 t/lib-encoding.sh                       |  25 +
 t/lib-parallel-checkout.sh              |  45 ++
 t/t0028-working-tree-encoding.sh        |  25 +-
 t/t2080-parallel-checkout-basics.sh     | 197 ++++++++
 t/t2081-parallel-checkout-collisions.sh | 116 +++++
 t/t2082-parallel-checkout-attributes.sh | 174 +++++++
 unpack-trees.c                          |  22 +-
 27 files changed, 1793 insertions(+), 152 deletions(-)
 create mode 100644 builtin/checkout--helper.c
 create mode 100644 entry.h
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h
 create mode 100644 t/lib-encoding.sh
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh
 create mode 100755 t/t2081-parallel-checkout-collisions.sh
 create mode 100755 t/t2082-parallel-checkout-attributes.sh

-- 
2.28.0