This is the last part of the parallel checkout series. It's based on mt/parallel-checkout-part-2. This part adds parallel checkout support to "git checkout-index" and "git checkout <pathspec>", and adds tests for the parallel checkout framework. Changes since v2: Patch 2: - Mentioned in the commit message which checkout mode the patch is adding parallel checkout support for (i.e. pathspec-limited mode). Patch 4: t/lib-parallel-checkout.sh: - Added/improved usage message for the helper functions. - Renamed trace file in test_checkout_workers to avoid collisions. - At verify_checkout: * Used --ignore-submodules=none for the diff-index execution. (I hadn't noticed before that the default behavior changed to 'untracked' on git v2.31.0.) * Removed the `diff-index --cached` check as it was redundant. We are already implicitly checking the index by asserting that `git status` is empty after the `git diff-index HEAD` check. t/t2080: - Used `test_ln_s_add`, to avoid requiring the SYMLINKS prereq in all tests but one. - Grouped the basic checkout/clone tests in a for loop to make make it easier to see what are the differences between then (i.e. the different cases they are testing). - Added more entry type changes in the branch switch used to check parallel checkout correctness. - Added a test to ensure submodules can also use parallel checkout. - Compare the working trees with `git diff --no-index` instead of `diff -r` to avoid following symlinks. Patch 6: - Split patch into two: one moving the helper functions from t0028 to lib-encoding.sh, and another actually adding the t2082 tests. Matheus Tavares (8): make_transient_cache_entry(): optionally alloc from mem_pool builtin/checkout.c: complete parallel checkout support checkout-index: add parallel checkout support parallel-checkout: add tests for basic operations parallel-checkout: add tests related to path collisions t0028: extract encoding helpers to lib-encoding.sh parallel-checkout: add tests related to .gitattributes ci: run test round with parallel-checkout enabled builtin/checkout--worker.c | 2 +- builtin/checkout-index.c | 24 ++- builtin/checkout.c | 20 ++- builtin/difftool.c | 2 +- cache.h | 11 +- ci/run-build-and-tests.sh | 1 + parallel-checkout.c | 18 ++ read-cache.c | 12 +- t/README | 4 + t/lib-encoding.sh | 25 +++ t/lib-parallel-checkout.sh | 45 +++++ t/t0028-working-tree-encoding.sh | 25 +-- t/t2080-parallel-checkout-basics.sh | 229 ++++++++++++++++++++++++ t/t2081-parallel-checkout-collisions.sh | 162 +++++++++++++++++ t/t2082-parallel-checkout-attributes.sh | 194 ++++++++++++++++++++ unpack-trees.c | 2 +- 16 files changed, 727 insertions(+), 49 deletions(-) create mode 100644 t/lib-encoding.sh create mode 100644 t/lib-parallel-checkout.sh create mode 100755 t/t2080-parallel-checkout-basics.sh create mode 100755 t/t2081-parallel-checkout-collisions.sh create mode 100755 t/t2082-parallel-checkout-attributes.sh Range-diff against v1: 1: f870040bfb = 1: f870040bfb make_transient_cache_entry(): optionally alloc from mem_pool 2: 5e0dee7beb ! 2: e2d82c4337 builtin/checkout.c: complete parallel checkout support @@ Metadata ## Commit message ## builtin/checkout.c: complete parallel checkout support - There is one code path in builtin/checkout.c which still doesn't benefit - from parallel checkout because it calls checkout_entry() directly, - instead of unpack_trees(). Let's add parallel checkout support for this - missing spot as well. Note: the transient cache entries allocated in - checkout_merged() are now allocated in a mem_pool which is only - discarded after parallel checkout finishes. This is done because the - entries need to be valid when run_parallel_checkout() is called. + Pathspec-limited checkouts (like `git checkout *.txt`) are performed by + a code path that doesn't yet support parallel checkout because it calls + checkout_entry() directly, instead of unpack_trees(). Let's add parallel + checkout support for this code path too. + + Note: the transient cache entries allocated in checkout_merged() are now + allocated in a mem_pool which is only discarded after parallel checkout + finishes. This is done because the entries need to be valid when + run_parallel_checkout() is called. Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx> 3: 6bf1db2fd8 = 3: 0fe1a5fabc checkout-index: add parallel checkout support 4: 46128ba8d8 ! 4: f604c50dba parallel-checkout: add tests for basic operations @@ Commit message included by t2080 for now. But they will also be used by other parallel-checkout tests in the following patches. - Original-patch-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> - Signed-off-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> + Co-authored-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx> ## t/lib-parallel-checkout.sh (new) ## @@ -+# Helpers for t208* tests ++# Helpers for tests invoking parallel-checkout + +set_checkout_config () { + if test $# -ne 2 + then -+ BUG "set_checkout_config() requires two arguments" ++ BUG "usage: set_checkout_config <workers> <threshold>" + fi && + + test_config_global checkout.workers $1 && @@ t/lib-parallel-checkout.sh (new) +test_checkout_workers () { + if test $# -lt 2 + then -+ BUG "too few arguments to test_checkout_workers()" ++ BUG "too few arguments to test_checkout_workers" + fi && + -+ expected_workers=$1 && ++ local expected_workers=$1 && + shift && + -+ rm -f trace && -+ GIT_TRACE2="$(pwd)/trace" "$@" && ++ local trace_file=trace-test-checkout-workers && ++ rm -f "$trace_file" && ++ GIT_TRACE2="$(pwd)/$trace_file" "$@" && + -+ workers=$(grep "child_start\[..*\] git checkout--worker" trace | wc -l) && ++ local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) && + test $workers -eq $expected_workers && -+ rm -f trace ++ rm "$trace_file" +} + +# Verify that both the working tree and the index were created correctly +verify_checkout () { -+ git -C "$1" diff-index --quiet HEAD -- && -+ git -C "$1" diff-index --quiet --cached HEAD -- && ++ if test $# -ne 1 ++ then ++ BUG "usage: verify_checkout <repository path>" ++ fi && ++ ++ git -C "$1" diff-index --ignore-submodules=none --exit-code HEAD -- && + git -C "$1" status --porcelain >"$1".status && + test_must_be_empty "$1".status +} @@ t/t2080-parallel-checkout-basics.sh (new) +. ./test-lib.sh +. "$TEST_DIRECTORY/lib-parallel-checkout.sh" + -+# Test parallel-checkout with a branch switch containing file creations, -+# deletions, and modification; with different entry types. Switching from B1 to -+# B2 will have the following changes: ++# Test parallel-checkout with a branch switch containing a variety of file ++# creations, deletions, and modifications, involving different entry types. ++# The branches B1 and B2 have the following paths: ++# ++# B1 B2 ++# a/a (file) a (file) ++# b (file) b/b (file) ++# ++# c/c (file) c (symlink) ++# d (symlink) d/d (file) ++# ++# e/e (file) e (submodule) ++# f (submodule) f/f (file) ++# ++# g (submodule) g (symlink) ++# h (symlink) h (submodule) ++# ++# Additionally, the following paths are present on both branches, but with ++# different contents: ++# ++# i (file) i (file) ++# j (symlink) j (symlink) ++# k (submodule) k (submodule) +# -+# - a (file): modified -+# - e/x (file): deleted -+# - b (symlink): deleted -+# - b/f (file): created -+# - e (symlink): created -+# - d (submodule): created ++# And the following paths are only present in one of the branches: +# -+test_expect_success SYMLINKS 'setup repo for checkout with various types of changes' ' ++# l/l (file) - ++# - m/m (file) ++# ++test_expect_success 'setup repo for checkout with various types of changes' ' ++ git init sub && ++ ( ++ cd sub && ++ git checkout -b B2 && ++ echo B2 >file && ++ git add file && ++ git commit -m file && ++ ++ git checkout -b B1 && ++ echo B1 >file && ++ git add file && ++ git commit -m file ++ ) && ++ + git init various && + ( + cd various && ++ + git checkout -b B1 && -+ echo a >a && -+ mkdir e && -+ echo e/x >e/x && -+ ln -s e b && -+ git add -A && ++ mkdir a c e && ++ echo a/a >a/a && ++ echo b >b && ++ echo c/c >c/c && ++ test_ln_s_add c d && ++ echo e/e >e/e && ++ git submodule add ../sub f && ++ git submodule add ../sub g && ++ test_ln_s_add c h && ++ ++ echo "B1 i" >i && ++ test_ln_s_add c j && ++ git submodule add -b B1 ../sub k && ++ mkdir l && ++ echo l/l >l/l && ++ ++ git add . && + git commit -m B1 && + + git checkout -b B2 && -+ echo modified >a && -+ rm -rf e && -+ rm b && -+ mkdir b && -+ echo b/f >b/f && -+ ln -s b e && -+ git init d && -+ test_commit -C d f && -+ git submodule add ./d && -+ git add -A && ++ git rm -rf :^.gitmodules :^k && ++ mkdir b d f && ++ echo a >a && ++ echo b/b >b/b && ++ test_ln_s_add b c && ++ echo d/d >d/d && ++ git submodule add ../sub e && ++ echo f/f >f/f && ++ test_ln_s_add b g && ++ git submodule add ../sub h && ++ ++ echo "B2 i" >i && ++ test_ln_s_add b j && ++ git -C k checkout B2 && ++ mkdir m && ++ echo m/m >m/m && ++ ++ git add . && + git commit -m B2 && + + git checkout --recurse-submodules B1 + ) +' + -+test_expect_success SYMLINKS 'sequential checkout' ' -+ cp -R various various_sequential && -+ set_checkout_config 1 0 && -+ test_checkout_workers 0 \ -+ git -C various_sequential checkout --recurse-submodules B2 && -+ verify_checkout various_sequential -+' ++for mode in sequential parallel sequential-fallback ++do ++ case $mode in ++ sequential) workers=1 threshold=0 expected_workers=0 ;; ++ parallel) workers=2 threshold=0 expected_workers=2 ;; ++ sequential-fallback) workers=2 threshold=100 expected_workers=0 ;; ++ esac + -+test_expect_success SYMLINKS 'parallel checkout' ' -+ cp -R various various_parallel && -+ set_checkout_config 2 0 && -+ test_checkout_workers 2 \ -+ git -C various_parallel checkout --recurse-submodules B2 && -+ verify_checkout various_parallel -+' ++ test_expect_success "$mode checkout" ' ++ repo=various_$mode && ++ cp -R various $repo && + -+test_expect_success SYMLINKS 'fallback to sequential checkout (threshold)' ' -+ cp -R various various_sequential_fallback && -+ set_checkout_config 2 100 && -+ test_checkout_workers 0 \ -+ git -C various_sequential_fallback checkout --recurse-submodules B2 && -+ verify_checkout various_sequential_fallback -+' ++ # The just copied files have more recent timestamps than their ++ # associated index entries. So refresh the cached timestamps ++ # to avoid an "entry not up-to-date" error from `git checkout`. ++ # We only have to do this for the submodules as `git checkout` ++ # will already refresh the superproject index before performing ++ # the up-to-date check. ++ # ++ git -C $repo submodule foreach "git update-index --refresh" && + -+test_expect_success SYMLINKS 'parallel checkout on clone' ' -+ git -C various checkout --recurse-submodules B2 && -+ set_checkout_config 2 0 && -+ test_checkout_workers 2 \ -+ git clone --recurse-submodules various various_parallel_clone && -+ verify_checkout various_parallel_clone -+' ++ set_checkout_config $workers $threshold && ++ test_checkout_workers $expected_workers \ ++ git -C $repo checkout --recurse-submodules B2 && ++ verify_checkout $repo ++ ' ++done + -+test_expect_success SYMLINKS 'fallback to sequential checkout on clone (threshold)' ' -+ git -C various checkout --recurse-submodules B2 && -+ set_checkout_config 2 100 && -+ test_checkout_workers 0 \ -+ git clone --recurse-submodules various various_sequential_fallback_clone && -+ verify_checkout various_sequential_fallback_clone -+' ++for mode in parallel sequential-fallback ++do ++ case $mode in ++ parallel) workers=2 threshold=0 expected_workers=2 ;; ++ sequential-fallback) workers=2 threshold=100 expected_workers=0 ;; ++ esac ++ ++ test_expect_success "$mode checkout on clone" ' ++ repo=various_${mode}_clone && ++ set_checkout_config $workers $threshold && ++ test_checkout_workers $expected_workers \ ++ git clone --recurse-submodules --branch B2 various $repo && ++ verify_checkout $repo ++ ' ++done + +# Just to be paranoid, actually compare the working trees' contents directly. -+test_expect_success SYMLINKS 'compare the working trees' ' ++test_expect_success 'compare the working trees' ' + rm -rf various_*/.git && -+ rm -rf various_*/d/.git && ++ rm -rf various_*/*/.git && ++ ++ # We use `git diff` instead of `diff -r` because the latter would ++ # follow symlinks, and not all `diff` implementations support the ++ # `--no-dereference` option. ++ # ++ git diff --no-index various_sequential various_parallel && ++ git diff --no-index various_sequential various_parallel_clone && ++ git diff --no-index various_sequential various_sequential-fallback && ++ git diff --no-index various_sequential various_sequential-fallback_clone ++' + -+ diff -r various_sequential various_parallel && -+ diff -r various_sequential various_sequential_fallback && -+ diff -r various_sequential various_parallel_clone && -+ diff -r various_sequential various_sequential_fallback_clone ++# Currently, each submodule is checked out in a separated child process, but ++# these subprocesses must also be able to use parallel checkout workers to ++# write the submodules' entries. ++test_expect_success 'submodules can use parallel checkout' ' ++ set_checkout_config 2 0 && ++ git init super && ++ ( ++ cd super && ++ git init sub && ++ test_commit -C sub A && ++ test_commit -C sub B && ++ git submodule add ./sub && ++ git commit -m sub && ++ rm sub/* && ++ test_checkout_workers 2 git checkout --recurse-submodules . ++ ) +' + +test_expect_success 'parallel checkout respects --[no]-force' ' 5: 71be08b895 ! 5: eae6d3a1c1 parallel-checkout: add tests related to path collisions @@ Commit message checkout workers, both to avoid race conditions and to report colliding entries on clone. - Original-patch-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> - Signed-off-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> + Co-authored-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx> ## parallel-checkout.c ## @@ parallel-checkout.c: void write_pc_item(struct parallel_checkout_item *pc_item, ## t/lib-parallel-checkout.sh ## @@ t/lib-parallel-checkout.sh: test_checkout_workers () { - shift && - rm -f trace && -- GIT_TRACE2="$(pwd)/trace" "$@" && -+ GIT_TRACE2="$(pwd)/trace" "$@" 2>&8 && + local trace_file=trace-test-checkout-workers && + rm -f "$trace_file" && +- GIT_TRACE2="$(pwd)/$trace_file" "$@" && ++ GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 && - workers=$(grep "child_start\[..*\] git checkout--worker" trace | wc -l) && + local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) && test $workers -eq $expected_workers && - rm -f trace + rm "$trace_file" -} +} 8>&2 2>&4 -: ---------- > 6: 9161cd1503 t0028: extract encoding helpers to lib-encoding.sh 6: 1dedfd994a ! 7: bc584897e6 parallel-checkout: add tests related to .gitattributes @@ Commit message require external filters, are correctly smudge and written when parallel-checkout is enabled. - Note: to avoid repeating code, some helper functions are extracted from - t0028 into a common lib file. - - Original-patch-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> - Signed-off-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> + Co-authored-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx> - ## t/lib-encoding.sh (new) ## -@@ -+# Encoding helpers used by t0028 and t2082 -+ -+test_lazy_prereq NO_UTF16_BOM ' -+ test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6 -+' -+ -+test_lazy_prereq NO_UTF32_BOM ' -+ test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12 -+' -+ -+write_utf16 () { -+ if test_have_prereq NO_UTF16_BOM -+ then -+ printf '\376\377' -+ fi && -+ iconv -f UTF-8 -t UTF-16 -+} -+ -+write_utf32 () { -+ if test_have_prereq NO_UTF32_BOM -+ then -+ printf '\0\0\376\377' -+ fi && -+ iconv -f UTF-8 -t UTF-32 -+} - - ## t/t0028-working-tree-encoding.sh ## -@@ t/t0028-working-tree-encoding.sh: GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main - export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME - - . ./test-lib.sh -+. "$TEST_DIRECTORY/lib-encoding.sh" - - GIT_TRACE_WORKING_TREE_ENCODING=1 && export GIT_TRACE_WORKING_TREE_ENCODING - --test_lazy_prereq NO_UTF16_BOM ' -- test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6 --' -- --test_lazy_prereq NO_UTF32_BOM ' -- test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12 --' -- --write_utf16 () { -- if test_have_prereq NO_UTF16_BOM -- then -- printf '\376\377' -- fi && -- iconv -f UTF-8 -t UTF-16 --} -- --write_utf32 () { -- if test_have_prereq NO_UTF32_BOM -- then -- printf '\0\0\376\377' -- fi && -- iconv -f UTF-8 -t UTF-32 --} -- - test_expect_success 'setup test files' ' - git config core.eol lf && - - ## t/t2082-parallel-checkout-attributes.sh (new) ## @@ +#!/bin/sh 7: 3cc5b10081 ! 8: 1bc5c523c5 ci: run test round with parallel-checkout enabled @@ t/README: and "sha256". ## t/lib-parallel-checkout.sh ## @@ - # Helpers for t208* tests + # Helpers for tests invoking parallel-checkout +# Parallel checkout tests need full control of the number of workers +unset GIT_TEST_CHECKOUT_WORKERS -- 2.30.1