[PATCH v2 0/3] reftable/stack: use geometric table compaction

Hello again,

This is the second version of my patch series, which refactors the reftable
compaction strategy to follow a geometric sequence. Changes compared
to v1:

 * Added the GIT_TEST_REFTABLE_NO_AUTOCOMPACTION environment variable to
   disable reftable autocompaction when testing.
 * Refactored the worktree tests in t0610-reftable-basics.sh to properly
   assert that git-pack-refs(1) works as expected.
 * Added a test to validate that alternating table sizes are compacted.
 * Added a benchmark to compare the compaction strategies.
 * Moved the change that makes the compaction segment end inclusive into its
   own commit.
 * Added further explanation to commit messages and comments, and fixed
   typos.
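
For readers who want the segment selection in one place, the two reverse
passes described in the commit message of patch 2 (quoted in the range-diff
below) can be sketched roughly as follows. This is a minimal standalone
illustration, not the code in reftable/stack.c: the names `sketch_segment` and
`sketch_compaction_segment` are hypothetical, the factor is hard-coded to 2,
and the segment end is exclusive, as it is in patch 2 before patch 3 changes
that convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for illustration only; not the reftable API. */
struct sketch_segment {
	size_t start;   /* index of the first table to compact */
	size_t end;     /* one past the last table to compact (exclusive) */
	uint64_t bytes; /* combined size of the tables in [start, end) */
};

/*
 * Pick the range of tables to compact so that the remaining sizes form a
 * geometric sequence with factor 2. An empty result has start == end.
 */
struct sketch_segment sketch_compaction_segment(const uint64_t *sizes, size_t n)
{
	const uint64_t factor = 2;
	struct sketch_segment seg = { 0, 0, 0 };
	uint64_t bytes = 0;
	size_t i;

	assert(sizes != NULL || n == 0);

	/* Zero or one table is trivially a geometric sequence. */
	if (n <= 1)
		return seg;

	/*
	 * Pass 1: walk from the newest table towards the oldest and stop at
	 * the first table whose predecessor is less than twice its size.
	 * That table becomes the last member of the segment.
	 */
	for (i = n - 1; i > 0; i--) {
		if (sizes[i - 1] < sizes[i] * factor) {
			seg.end = i + 1;
			bytes = sizes[i];
			break;
		}
	}

	/*
	 * Pass 2: keep walking and accumulate sizes. Every table that is
	 * smaller than twice the bytes accumulated so far moves the segment
	 * start back, because the merged table would otherwise violate the
	 * sequence. If pass 1 found nothing, i is 0 and this loop is skipped.
	 */
	for (; i > 0; i--) {
		uint64_t curr = bytes;

		bytes += sizes[i - 1];
		if (sizes[i - 1] < curr * factor) {
			seg.start = i - 1;
			seg.bytes = bytes;
		}
	}

	return seg;
}
```

With the sizes { 512, 64, 17, 16, 9, 9, 9, 16, 2, 16 } this sketch selects
start 1 and end 10, which is consistent with the updated `min.start` and
`min.end` expectations in the range-diff, assuming a test input of that shape.
A stack that is already geometric, such as { 64, 32, 16, 8, 4, 2 }, yields an
empty segment.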

Thanks for taking a look!

Justin

Justin Tobler (3):
  reftable/stack: add env to disable autocompaction
  reftable/stack: use geometric table compaction
  reftable/segment: make segment end inclusive

 reftable/stack.c           | 113 ++++++++++++++++---------------------
 reftable/stack.h           |   3 -
 reftable/stack_test.c      |  66 +++++-----------------
 reftable/system.h          |   1 +
 t/t0610-reftable-basics.sh |  43 +++++++++-----
 5 files changed, 94 insertions(+), 132 deletions(-)


base-commit: 3bd955d26919e149552f34aacf8a4e6368c26cec
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1683%2Fjltobler%2Fjt%2Freftable-geometric-compaction-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1683/jltobler/jt/reftable-geometric-compaction-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1683

Range-diff vs v1:

 -:  ----------- > 1:  cb6b152e5c8 reftable/stack: add env to disable autocompaction
 1:  7a518853a10 ! 2:  def70084523 reftable/stack: use geometric table compaction
     @@ Commit message
          occurring until a separate operation produces a table matching the
          previous table log value.
      
     -    To avoid unbounded growth of the table list, walk through each table and
     -    evaluate if it needs to be included in the compaction segment to restore
     -    a geometric sequence.
     +    Instead, to avoid unbounded growth of the table list, the compaction
     +    strategy is updated to ensure tables follow a geometric sequence after
     +    each operation. This is done by walking the table list in reverse index
     +    order to identify the compaction segment start and end. The compaction
     +    segment end is found by identifying the first table which has a
     +    preceding table size less than twice the current table. Next, the
     +    compaction segment start is found iterating through the remaining tables
     +    in the list checking if the previous table size is less than twice the
     +    cumulative of tables from the segment end. This ensures the correct
     +    segment start is found and that the newly compacted table does not
     +    violate the geometric sequence.
     +
     +    When creating 10 thousand references, the new strategy has no
     +    performance impact:
     +
     +    Benchmark 1: update-ref: create refs sequentially (revision = HEAD~)
     +      Time (mean ± σ):     26.516 s ±  0.047 s    [User: 17.864 s, System: 8.491 s]
     +      Range (min … max):   26.447 s … 26.569 s    10 runs
     +
     +    Benchmark 2: update-ref: create refs sequentially (revision = HEAD)
     +      Time (mean ± σ):     26.417 s ±  0.028 s    [User: 17.738 s, System: 8.500 s]
     +      Range (min … max):   26.366 s … 26.444 s    10 runs
     +
     +    Summary
     +      update-ref: create refs sequentially (revision = HEAD) ran
     +        1.00 ± 0.00 times faster than update-ref: create refs sequentially (revision = HEAD~)
      
          Some tests in `t0610-reftable-basics.sh` assert the on-disk state of
          tables and are therefore updated to specify the correct new table count.
     @@ reftable/stack.c: static int segment_size(struct segment *s)
      +	 * until a valid segment end is found. If the preceding table is smaller
      +	 * than the current table multiplied by the geometric factor (2), the
      +	 * current table is set as the compaction segment end.
     ++	 *
     ++	 * Tables after the ending point are not added to the byte count because
     ++	 * they are already valid members of the geometric sequence. Due to the
     ++	 * properties of a geometric sequence, it is not possible for the sum of
     ++	 * these tables to exceed the value of the ending point table.
      +	 */
      +	for (i = n - 1; i > 0; i--) {
      +		if (sizes[i - 1] < sizes[i] * 2) {
     -+			seg.end = i;
     ++			seg.end = i + 1;
      +			bytes = sizes[i];
       			break;
      +		}
     @@ reftable/stack.c: static int segment_size(struct segment *s)
      +
      +	/*
      +	 * Find the starting table of the compaction segment by iterating
     -+	 * through the remaing tables and keeping track of the accumulated size
     -+	 * of all tables seen from the segment end table.
     ++	 * through the remaining tables and keeping track of the accumulated
     ++	 * size of all tables seen from the segment end table.
      +	 *
      +	 * Note that we keep iterating even after we have found the first
     -+	 * first starting point. This is because there may be tables in the
     -+	 * stack preceding that first starting point which violate the geometric
     ++	 * starting point. This is because there may be tables in the stack
     ++	 * preceding that first starting point which violate the geometric
      +	 * sequence.
      +	 */
      +	for (; i > 0; i--) {
     @@ reftable/stack.c: static int segment_size(struct segment *s)
       }
       
       static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
     -@@ reftable/stack.c: int reftable_stack_auto_compact(struct reftable_stack *st)
     - 		suggest_compaction_segment(sizes, st->merged->stack_len);
     - 	reftable_free(sizes);
     - 	if (segment_size(&seg) > 0)
     --		return stack_compact_range_stats(st, seg.start, seg.end - 1,
     -+		return stack_compact_range_stats(st, seg.start, seg.end,
     - 						 NULL);
     - 
     - 	return 0;
      
       ## reftable/stack.h ##
      @@ reftable/stack.h: int read_lines(const char *filename, char ***lines);
     @@ reftable/stack_test.c: static void test_reftable_stack_hash_id(void)
      -	EXPECT(min.start == 2);
      -	EXPECT(min.end == 7);
      +	EXPECT(min.start == 1);
     -+	EXPECT(min.end == 9);
     ++	EXPECT(min.end == 10);
       }
       
       static void test_suggest_compaction_segment_nothing(void)
     @@ t/t0610-reftable-basics.sh: test_expect_success 'ref transaction: writes cause a
       
       	test_commit -C repo --no-tag B &&
       	test_line_count = 1 repo/.git/reftable/tables.list
     + '
     + 
     ++test_expect_success 'ref transaction: alternating table sizes are compacted' '
     ++	test_when_finished "rm -rf repo" &&
     ++	git init repo &&
     ++	test_commit -C repo A &&
     ++	for i in $(test_seq 20)
     ++	do
     ++		git -C repo branch -f foo &&
     ++		git -C repo branch -d foo || return 1
     ++	done &&
     ++	test_line_count = 2 repo/.git/reftable/tables.list
     ++'
     ++
     + check_fsync_events () {
     + 	local trace="$1" &&
     + 	shift &&
      @@ t/t0610-reftable-basics.sh: test_expect_success 'ref transaction: writes are synced' '
       		git -C repo -c core.fsync=reference \
       		-c core.fsyncMethod=fsync update-ref refs/heads/branch HEAD &&
     @@ t/t0610-reftable-basics.sh: do
       		git -C repo pack-refs &&
       		test_expect_perms "-rw-rw-r--" repo/.git/reftable/tables.list &&
      @@ t/t0610-reftable-basics.sh: test_expect_success 'worktree: pack-refs in main repo packs main refs' '
     + 	test_when_finished "rm -rf repo worktree" &&
     + 	git init repo &&
       	test_commit -C repo A &&
     - 	git -C repo worktree add ../worktree &&
     +-	git -C repo worktree add ../worktree &&
     ++	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
     ++	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
       
      -	test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
      -	test_line_count = 4 repo/.git/reftable/tables.list &&
     -+	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
     -+	test_line_count = 1 repo/.git/reftable/tables.list &&
     ++	test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
     ++	test_line_count = 3 repo/.git/reftable/tables.list &&
       	git -C repo pack-refs &&
      -	test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
     -+	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
     ++	test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
       	test_line_count = 1 repo/.git/reftable/tables.list
       '
       
      @@ t/t0610-reftable-basics.sh: test_expect_success 'worktree: pack-refs in worktree packs worktree refs' '
     + 	test_when_finished "rm -rf repo worktree" &&
     + 	git init repo &&
       	test_commit -C repo A &&
     - 	git -C repo worktree add ../worktree &&
     +-	git -C repo worktree add ../worktree &&
     ++	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
     ++	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
       
      -	test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
      -	test_line_count = 4 repo/.git/reftable/tables.list &&
     -+	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
     -+	test_line_count = 1 repo/.git/reftable/tables.list &&
     ++	test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
     ++	test_line_count = 3 repo/.git/reftable/tables.list &&
       	git -C worktree pack-refs &&
       	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
      -	test_line_count = 4 repo/.git/reftable/tables.list
     -+	test_line_count = 1 repo/.git/reftable/tables.list
     ++	test_line_count = 3 repo/.git/reftable/tables.list
       '
       
       test_expect_success 'worktree: creating shared ref updates main stack' '
     + 	test_when_finished "rm -rf repo worktree" &&
     + 	git init repo &&
     + 	test_commit -C repo A &&
     ++	test_commit -C repo B &&
     + 
     + 	git -C repo worktree add ../worktree &&
     + 	git -C repo pack-refs &&
      @@ t/t0610-reftable-basics.sh: test_expect_success 'worktree: creating shared ref updates main stack' '
     + 	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
     + 	test_line_count = 1 repo/.git/reftable/tables.list &&
       
     - 	git -C worktree update-ref refs/heads/shared HEAD &&
     +-	git -C worktree update-ref refs/heads/shared HEAD &&
     ++	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/heads/shared HEAD &&
       	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
     --	test_line_count = 2 repo/.git/reftable/tables.list
     -+	test_line_count = 1 repo/.git/reftable/tables.list
     + 	test_line_count = 2 repo/.git/reftable/tables.list
       '
     - 
     - test_expect_success 'worktree: creating per-worktree ref updates worktree stack' '
 -:  ----------- > 3:  a23e3fc6972 reftable/segment: make segment end inclusive

-- 
gitgitgadget



