Re: [PATCH 4/4] t5326: test propagating hashcache values

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 13, 2021 at 07:05:32PM -0700, Junio C Hamano wrote:

> Taylor Blau <me@xxxxxxxxxxxx> writes:
> 
> > Alas, there they are. They are basically no different than having the
> > name-hash for single pack bitmaps, it's just now we don't throw them
> > away when generating a MIDX bitmap from a state where the repository
> > already has a single-pack bitmap.
> 
> I actually wasn't expecting any CPU/time difference.

I was, for the same reason we saw an improvement there in ae4f07fbcc
(pack-bitmap: implement optional name_hash cache, 2013-12-21): without a
name-hash, we try a bunch of fruitless deltas before we find a decent
one.

> I hope that we are talking about the same name-hash, which is used
> to sort the blobs so that when pack-objects try to find a good delta
> base, the blobs from the same path will sit close to each other and
> hopefully fit in the pack window.

Yes, exactly. We spend less time finding the good ones if the likely
candidates are close together. We may _also_ find better ones overall,
depending on the number of candidates and the window size.

The bitmap perf tests (neither p5310 nor its new midx cousin p5326)
don't check the output size.

> The effect I was hoping to see by not discarding the information was
> that we find better delta base hence smaller deltas in the resulting
> packfiles.

If we add a size check like so[1]:

diff --git a/t/perf/lib-bitmap.sh b/t/perf/lib-bitmap.sh
index 63d3bc7cec..648cd5b13d 100644
--- a/t/perf/lib-bitmap.sh
+++ b/t/perf/lib-bitmap.sh
@@ -10,7 +10,11 @@ test_full_bitmap () {
 		{
 			echo HEAD &&
 			echo ^$have
-		} | git pack-objects --revs --stdout >/dev/null
+		} | git pack-objects --revs --stdout >tmp.pack
+	'
+
+	test_size 'fetch size' '
+		wc -c <tmp.pack
 	'
 
 	test_perf 'pack to file (bitmap)' '

then the results I get using linux.git are:

Test                       origin/tb/multi-pack-bitmaps   origin/tb/midx-write-propagate-namehash
-------------------------------------------------------------------------------------------------
5326.4: simulated fetch    2.32(7.16+0.21)                2.00(3.79+0.18) -13.8%
5326.5: fetch size         16.7M                          15.5M -7.1%

so you can see that we spent about half as much CPU (ignore the
wall-clock percentage; the interesting thing is the userspace time,
because my machine has 8 cores). But we also shaved off a bit from the
pack, so we really did manage to find better deltas, too.

I see that Taylor just posted a very similar response, and independently
did the exact same experiment I did. ;) I'll send this anyway, though,
as my particular run showed slightly different results.

-Peff

[1] The other thing you'd want (and I presume Taylor was using for his
    earlier timings) is:

diff --git a/t/perf/p5326-multi-pack-bitmaps.sh b/t/perf/p5326-multi-pack-bitmaps.sh
index 5845109ac7..a4ac7746a7 100755
--- a/t/perf/p5326-multi-pack-bitmaps.sh
+++ b/t/perf/p5326-multi-pack-bitmaps.sh
@@ -11,7 +11,7 @@ test_expect_success 'enable multi-pack index' '
 '
 
 test_perf 'setup multi-pack index' '
-	git repack -ad &&
+	git repack -adb &&
 	git multi-pack-index write --bitmap
 '
 

since otherwise there is no pack bitmap for the midx to pull the
name-hashes from.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux