[PATCH 04/13] p5313: add performance tests for --path-walk

From: Derrick Stolee <stolee@xxxxxxxxx>

The previous change added a --path-walk option to 'git pack-objects'.
Create a performance test that demonstrates the time and space benefits
of the feature.

In order to get an appropriate comparison, we need to avoid reusing
deltas and recompute them from scratch.

Compare the creation of a thin pack representing a small push and the
creation of a relatively large non-thin pack.
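
For readers unfamiliar with thin packs, the following is a minimal sketch of the "small push" shape, not the perf test itself. The helper name make_thin_pack, the throwaway repository, and the committer identity are all hypothetical. It assumes the rev input lists HEAD and excludes its parent. The inputs the test really uses come from its 'create rev input' step, which this patch leaves unchanged.

```shell
# Hypothetical helper: build a throwaway repo with two commits, then pack
# only what the newest commit adds as a thin pack. Prints the pack size in bytes.
make_thin_pack () {
	dir=$(mktemp -d) || return 1
	(
		cd "$dir" &&
		git init -q . &&
		echo one >file &&
		git add file &&
		git -c user.name=perf -c user.email=perf@example.com \
			commit -q -m one &&
		echo two >>file &&
		git -c user.name=perf -c user.email=perf@example.com \
			commit -q -a -m two &&
		# Revision input: include HEAD, exclude what its parent already has.
		{
			git rev-parse HEAD &&
			echo "^$(git rev-parse HEAD~1)"
		} >in-thin &&
		git pack-objects --thin --stdout --revs --sparse <in-thin >out &&
		wc -c <out
	)
	status=$?
	rm -rf "$dir"
	return $status
}
```

Dropping --thin and widening the excluded range would give something closer to the "big pack" case.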

Running on my copy of the Git repository results in this data (removing
the repack tests for --name-hash-version):

Test                                                     this tree
------------------------------------------------------------------------
5313.2: thin pack with --name-hash-version=1             0.02(0.01+0.01)
5313.3: thin pack size with --name-hash-version=1                   1.6K
5313.4: big pack with --name-hash-version=1              2.55(4.20+0.26)
5313.5: big pack size with --name-hash-version=1                   16.4M
5313.6: shallow fetch pack with --name-hash-version=1    1.24(2.03+0.08)
5313.7: shallow pack size with --name-hash-version=1               12.2M
5313.10: thin pack with --name-hash-version=2            0.03(0.01+0.01)
5313.11: thin pack size with --name-hash-version=2                  1.6K
5313.12: big pack with --name-hash-version=2             1.91(3.23+0.20)
5313.13: big pack size with --name-hash-version=2                  16.4M
5313.14: shallow fetch pack with --name-hash-version=2   1.06(1.57+0.10)
5313.15: shallow pack size with --name-hash-version=2              12.5M
5313.18: thin pack with --path-walk                      0.03(0.01+0.01)
5313.19: thin pack size with --path-walk                            1.6K
5313.20: big pack with --path-walk                       2.05(3.24+0.27)
5313.21: big pack size with --path-walk                            16.3M
5313.22: shallow fetch pack with --path-walk             1.08(1.66+0.07)
5313.23: shallow pack size with --path-walk                        12.4M

This can be reformatted as follows:

Pack Type            Hash v1    Hash v2    Path Walk
---------------------------------------------------
thin pack    (time)    0.02s      0.03s      0.03s
             (size)    1.6K       1.6K       1.6K
big pack     (time)    2.55s      1.91s      2.05s
             (size)   16.4M      16.4M      16.3M
shallow pack (time)    1.24s      1.06s      1.08s
             (size)   12.2M      12.5M      12.4M

Note that --path-walk is slightly slower than --name-hash-version=2 here
because there is no threading in the --path-walk case (yet). Also, the
shallow pack cases do not currently exercise the --path-walk logic: it
is disabled for them until some additions are made to the path walk API.

The --path-walk option really shines when the default name-hash is
overwhelmed with collisions. An open source example can be found in the
microsoft/fluentui repo [1] at a certain commit [2].

[1] https://github.com/microsoft/fluentui
[2] e70848ebac1cd720875bccaa3026f4a9ed700e08
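
One way to point the perf script at such a repository is sketched below. The helper name run_p5313 and both of its arguments are hypothetical. The sketch assumes a built Git source tree and sets both GIT_PERF_REPO and GIT_PERF_LARGE_REPO, since which of the two the script consults is not shown in this patch.

```shell
# Hypothetical helper: run p5313 from a Git source tree against another repo.
#   $1 = path to a built Git source checkout that contains t/perf
#   $2 = absolute path to the repository to measure
run_p5313 () {
	git_src=$1
	repo=$2
	if ! test -f "$git_src/t/perf/p5313-pack-objects.sh"
	then
		echo "no t/perf/p5313-pack-objects.sh under '$git_src'" >&2
		return 1
	fi
	if ! test -d "$repo"
	then
		echo "no repository at '$repo'" >&2
		return 1
	fi
	(
		cd "$git_src/t/perf" &&
		# Set both variables so either repo selector in perf-lib finds it.
		GIT_PERF_REPO=$repo GIT_PERF_LARGE_REPO=$repo \
			./p5313-pack-objects.sh
	)
}
```

For the fluentui case, one would first clone [1] and check out [2] in that clone, then call the helper with the two paths.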

Running the tests on this repo results in the following comparison table:

Pack Type            Hash v1    Hash v2    Path Walk
---------------------------------------------------
thin pack    (time)    0.36s      0.12s      0.08s
             (size)    1.2M      22.0K      18.4K
big pack     (time)    2.00s      2.90s      2.21s
             (size)   20.4M      25.9M      19.5M
shallow pack (time)    1.41s      1.80s      1.65s
             (size)   34.4M      33.7M      33.6M

Notice in particular that for the small thin pack, the time has improved
from 0.36s for --name-hash-version=1 to 0.08s for --path-walk. This is
likely due to the much smaller resulting pack: 18.4K instead of 1.2M.
The relatively new --name-hash-version=2 is competitive with --path-walk
(0.12s and 22.0K) but not quite as successful.
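
To put those numbers in proportion, here is a small worked calculation. The helper name ratio is hypothetical, and the size conversion assumes 1M = 1024K, which the table does not state.

```shell
# Hypothetical helper: print a / b rounded to one decimal place.
ratio () {
	awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Thin pack time, Hash v1 vs Path Walk: 0.36s / 0.08s
ratio 0.36 0.08

# Thin pack size, Hash v1 vs Path Walk: 1.2M (assumed 1228.8K) / 18.4K
ratio 1228.8 18.4
```

Under that assumption, the thin pack is about 4.5 times faster to create and roughly 67 times smaller with --path-walk than with --name-hash-version=1.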

Finally, running this on a copy of the Linux kernel repository results
in these data points:

Pack Type            Hash v1    Hash v2    Path Walk
---------------------------------------------------
thin pack    (time)    0.03s      0.13s      0.03s
             (size)    4.6K       4.6K       4.6K
big pack     (time)   15.29s     12.32s     13.92s
             (size)  201.1M     159.1M     158.5M
shallow pack (time)   10.88s     22.93s     22.74s
             (size)  269.2M     273.8M     267.7M

Signed-off-by: Derrick Stolee <stolee@xxxxxxxxx>
---
 t/perf/p5313-pack-objects.sh | 37 ++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/t/perf/p5313-pack-objects.sh b/t/perf/p5313-pack-objects.sh
index be5229a0ecd..cd6dd3abb71 100755
--- a/t/perf/p5313-pack-objects.sh
+++ b/t/perf/p5313-pack-objects.sh
@@ -25,46 +25,55 @@ test_expect_success 'create rev input' '
 	EOF
 '
 
-for version in 1 2
-do
-	export version
+test_all_with_args () {
+	parameter=$1
+	export parameter
 
-	test_perf "thin pack with version $version" '
+	test_perf "thin pack with $parameter" '
 		git pack-objects --thin --stdout --revs --sparse \
-			--name-hash-version=$version <in-thin >out
+			$parameter <in-thin >out
 	'
 
-	test_size "thin pack size with version $version" '
+	test_size "thin pack size with $parameter" '
 		test_file_size out
 	'
 
-	test_perf "big pack with version $version" '
+	test_perf "big pack with $parameter" '
 		git pack-objects --stdout --revs --sparse \
-			--name-hash-version=$version <in-big >out
+			$parameter <in-big >out
 	'
 
-	test_size "big pack size with version $version" '
+	test_size "big pack size with $parameter" '
 		test_file_size out
 	'
 
-	test_perf "shallow fetch pack with version $version" '
+	test_perf "shallow fetch pack with $parameter" '
 		git pack-objects --stdout --revs --sparse --shallow \
-			--name-hash-version=$version <in-shallow >out
+			$parameter <in-shallow >out
 	'
 
-	test_size "shallow pack size with version $version" '
+	test_size "shallow pack size with $parameter" '
 		test_file_size out
 	'
+}
 
-	test_perf "repack with version $version" '
+for version in 1 2
+do
+	export version
+
+	test_all_with_args --name-hash-version=$version
+
+	test_perf "repack with --name-hash-version=$version" '
 		git repack -adf --name-hash-version=$version
 	'
 
-	test_size "repack size with version $version" '
+	test_size "repack size with --name-hash-version=$version" '
 		gitdir=$(git rev-parse --git-dir) &&
 		pack=$(ls $gitdir/objects/pack/pack-*.pack) &&
 		test_file_size "$pack"
 	'
 done
 
+test_all_with_args --path-walk
+
 test_done
-- 
gitgitgadget




