Re: [PATCH 02/20] t/perf: add performance test for sparse operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 23, 2021 at 12:14 PM Derrick Stolee via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
>
> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>
> Create a test script that takes the default performance test (the Git
> codebase) and multiplies it by 256 using four layers of duplicated
> trees of width four. This results in nearly one million blob entries in
> the index. Then, we can clone this repository with sparse-checkout
> patterns that demonstrate four copies of the initial repository. Each
> clone will use a different index format or mode so peformance can be
> tested across the different options.
>
> Note that the initial repo is stripped of submodules before doing the
> copies. This preserves the expected data shape of the sparse index,
> because directories containing submodules are not collapsed to a sparse
> directory entry.
>
> Run a few Git commands on these clones, especially those that use the
> index (status, add, commit).
>
> Here are the results on my Linux machine:
>
> Test
> --------------------------------------------------------------
> 2000.2: git status (full-index-v3)             0.37(0.30+0.09)
> 2000.3: git status (full-index-v4)             0.39(0.32+0.10)
> 2000.4: git add -A (full-index-v3)             1.42(1.06+0.20)
> 2000.5: git add -A (full-index-v4)             1.26(0.98+0.16)
> 2000.6: git add . (full-index-v3)              1.40(1.04+0.18)
> 2000.7: git add . (full-index-v4)              1.26(0.98+0.17)
> 2000.8: git commit -a -m A (full-index-v3)     1.42(1.11+0.16)
> 2000.9: git commit -a -m A (full-index-v4)     1.33(1.08+0.16)
>
> It is perhaps noteworthy that there is an improvement when using index
> version 4. This is because the v3 index uses 108 MiB while the v4
> index uses 80 MiB. Since the repeated portions of the directories are
> very short (f3/f1/f2, for example) this ratio is less pronounced than in
> similarly-sized real repositories.
>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  t/perf/p2000-sparse-operations.sh | 87 +++++++++++++++++++++++++++++++
>  1 file changed, 87 insertions(+)
>  create mode 100755 t/perf/p2000-sparse-operations.sh
>
> diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
> new file mode 100755
> index 000000000000..52597683376e
> --- /dev/null
> +++ b/t/perf/p2000-sparse-operations.sh
> @@ -0,0 +1,87 @@
> +#!/bin/sh
> +
> +test_description="test performance of Git operations using the index"
> +
> +. ./perf-lib.sh
> +
> +test_perf_default_repo
> +
> +SPARSE_CONE=f2/f4/f1
> +
> +test_expect_success 'setup repo and indexes' '
> +       git reset --hard HEAD &&
> +       # Remove submodules from the example repo, because our
> +       # duplication of the entire repo creates an unlikly data shape.
> +       git config --file .gitmodules --get-regexp "submodule.*.path" >modules &&
> +       rm -f .gitmodules &&
> +       git add .gitmodules &&

Why not `git rm [-f] .gitmodules` instead of these two commands?  Is
there something special about .gitmodules that requires this special
handling?

> +       for module in $(awk "{print \$2}" modules)
> +       do
> +               git rm $module || return 1
> +       done &&
> +       git add . &&

What does the `git add .` do?  I don't see any changes there weren't
already git-add'ed or git-rm'ed.

> +       git commit -m "remove submodules" &&
> +
> +       echo bogus >a &&
> +       cp a b &&
> +       git add a b &&
> +       git commit -m "level 0" &&
> +       BLOB=$(git rev-parse HEAD:a) &&
> +       OLD_COMMIT=$(git rev-parse HEAD) &&
> +       OLD_TREE=$(git rev-parse HEAD^{tree}) &&
> +
> +       for i in $(test_seq 1 4)
> +       do
> +               cat >in <<-EOF &&
> +                       100755 blob $BLOB       a
> +                       040000 tree $OLD_TREE   f1
> +                       040000 tree $OLD_TREE   f2
> +                       040000 tree $OLD_TREE   f3
> +                       040000 tree $OLD_TREE   f4
> +               EOF
> +               NEW_TREE=$(git mktree <in) &&
> +               NEW_COMMIT=$(git commit-tree $NEW_TREE -p $OLD_COMMIT -m "level $i") &&
> +               OLD_TREE=$NEW_TREE &&
> +               OLD_COMMIT=$NEW_COMMIT || return 1
> +       done &&
> +
> +       git sparse-checkout init --cone &&
> +       git branch -f wide $OLD_COMMIT &&
> +       git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v3 &&
> +       (
> +               cd full-index-v3 &&
> +               git sparse-checkout init --cone &&
> +               git sparse-checkout set $SPARSE_CONE &&
> +               git config index.version 3 &&
> +               git update-index --index-version=3
> +       ) &&
> +       git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v4 &&
> +       (
> +               cd full-index-v4 &&
> +               git sparse-checkout init --cone &&
> +               git sparse-checkout set $SPARSE_CONE &&
> +               git config index.version 4 &&
> +               git update-index --index-version=4
> +       )
> +'
> +
> +test_perf_on_all () {
> +       command="$@"
> +       for repo in full-index-v3 full-index-v4
> +       do
> +               test_perf "$command ($repo)" "
> +                       (
> +                               cd $repo &&
> +                               echo >>$SPARSE_CONE/a &&
> +                               $command
> +                       )
> +               "
> +       done
> +}
> +
> +test_perf_on_all git status
> +test_perf_on_all git add -A
> +test_perf_on_all git add .
> +test_perf_on_all git commit -a -m A
> +
> +test_done
> --
> gitgitgadget

Other than the two minor questions, the rest looks good to me.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux