"Garima Singh via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Garima Singh <garima.singh@xxxxxxxxxxxxx> > > Add tests for the --changed-paths feature when writing > commit-graphs. It doesn't look however as if this test is actually testing the _Bloom filter_ functionality itself -- because this test looks like copy'n'paste of t/t5324-split-commit-graph.sh, just with `--changed-paths` added to the `git commit-graph write` invocation, and added checking via enhanced test-tool that there are Bloom filter chunks ("bloom_indexes" and "bloom_data"). Please correct me if I am wrong, but this looks like a simple sanity check for me. > > RFC Notes: > I plan to split this test across some of the earlier commits > as appropriate. About adding tests to earlier commits in this series: 1. Testing Bloom filter functionality: - creating Bloom filter and adding elements to it - testing Bloom filter functionality - for element in set the answer is "maybe" - for element not in set the answer is "no" or "maybe" - automatic resizing works (6 and 7 elements) - it works for different number of hash functions, and different number of bits per element (maybe?) 2. Testing Bloom filter for commit changeset: - it works for commit with no changes - it works for merge commit with no changes to first parent (`git merge --strategy=ours`) - with number of changes that require filter size change - with maximal number of changes, one changed file less, one changed file more - that for file deeper in hierarchy, path/to/file, all of changed directories (path/to/ and path/) are also added 3. Test writing and reading commit-graph with Bloom filters - that after writing Bloom filters with `--changed-paths` the data is present in commit-graph files - it works correctly with split commit-graph - it doesn't crash if confronted with unknown settings: hash version different than 1, different number of hash functions, different number of bits per element 4. Bloom filter specific `git commit-graph verify` parts - fail if Bloom filter chunks appear multiple times - fail if only one of BIDX or BDAT chunks are present - fail if BIDX is not monotonic, that is if size of Bloom filter for a commit is negative - fail if BDAT size does not agree with BIDX, being either too small, or too large - check if values of number of hash functions and number of bits per element added are sane 5. Using Bloom filters to speed up Git operations - test that with and without Bloom filters (or commit-graph) the following operations work the same: - git log -- <path/to/file> - git log -- <path/to/directory> - git log -- '*.c' # or other glob pattern - git log -- <file1> <file2> - git log --follow <file> - maybe also `git log --full-history -- <file>` - if possible, add performance tests, see `t/perf` > > Helped-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > Signed-off-by: Garima Singh <garima.singh@xxxxxxxxxxxxx> > --- > t/helper/test-read-graph.c | 4 + > t/t5325-commit-graph-bloom.sh | 255 ++++++++++++++++++++++++++++++++++ > 2 files changed, 259 insertions(+) > create mode 100755 t/t5325-commit-graph-bloom.sh > > diff --git a/t/helper/test-read-graph.c b/t/helper/test-read-graph.c > index d2884efe0a..aff597c7a3 100644 > --- a/t/helper/test-read-graph.c > +++ b/t/helper/test-read-graph.c > @@ -45,6 +45,10 @@ int cmd__read_graph(int argc, const char **argv) > printf(" commit_metadata"); > if (graph->chunk_extra_edges) > printf(" extra_edges"); > + if (graph->chunk_bloom_indexes) > + printf(" bloom_indexes"); > + if (graph->chunk_bloom_data) > + printf(" bloom_data"); All right, though it is very basic information. > printf("\n"); > > UNLEAK(graph); > diff --git a/t/t5325-commit-graph-bloom.sh b/t/t5325-commit-graph-bloom.sh > new file mode 100755 > index 0000000000..d7ef0e7fb3 > --- /dev/null > +++ b/t/t5325-commit-graph-bloom.sh > @@ -0,0 +1,255 @@ > +#!/bin/sh > + > +test_description='commit graph with bloom filters' > +. ./test-lib.sh > + > +test_expect_success 'setup repo' ' > + git init && > + git config core.commitGraph true && > + git config gc.writeCommitGraph false && > + infodir=".git/objects/info" && > + graphdir="$infodir/commit-graphs" && > + test_oid_init > +' > + > +graph_read_expect() { Style: space between function name and parentheses, i.e. +graph_read_expect () { > + OPTIONAL="" Not used anywhere. > + NUM_CHUNKS=5 > + if test ! -z $2 It might be good idea to add names to those parameters by setting some local variables to $1 and $2; or, alternatively add comment describing this function. > + then > + OPTIONAL=" $2" > + NUM_CHUNKS=$((NUM_CHUNKS + $(echo "$2" | wc -w))) > + fi > + cat >expect <<- EOF > + header: 43475048 1 1 $NUM_CHUNKS 0 > + num_commits: $1 > + chunks: oid_fanout oid_lookup commit_metadata bloom_indexes bloom_data > + EOF > + test-tool read-graph >output && > + test_cmp expect output > +} No comments below this point... Best, Jakub Narębski > + > +test_expect_success 'create commits and write commit-graph' ' > + for i in $(test_seq 3) > + do > + test_commit $i && > + git branch commits/$i || return 1 > + done && > + git commit-graph write --reachable --changed-paths && > + test_path_is_file $infodir/commit-graph && > + graph_read_expect 3 > +' > + > +graph_git_two_modes() { > + git -c core.commitGraph=true $1 >output > + git -c core.commitGraph=false $1 >expect > + test_cmp expect output > +} > + > +graph_git_behavior() { > + MSG=$1 > + BRANCH=$2 > + COMPARE=$3 > + test_expect_success "check normal git operations: $MSG" ' > + graph_git_two_modes "log --oneline $BRANCH" && > + graph_git_two_modes "log --topo-order $BRANCH" && > + graph_git_two_modes "log --graph $COMPARE..$BRANCH" && > + graph_git_two_modes "branch -vv" && > + graph_git_two_modes "merge-base -a $BRANCH $COMPARE" > + ' > +} > + > +graph_git_behavior 'graph exists' commits/3 commits/1 > + > +verify_chain_files_exist() { > + for hash in $(cat $1/commit-graph-chain) > + do > + test_path_is_file $1/graph-$hash.graph || return 1 > + done > +} > + > +test_expect_success 'add more commits, and write a new base graph' ' > + git reset --hard commits/1 && > + for i in $(test_seq 4 5) > + do > + test_commit $i && > + git branch commits/$i || return 1 > + done && > + git reset --hard commits/2 && > + for i in $(test_seq 6 10) > + do > + test_commit $i && > + git branch commits/$i || return 1 > + done && > + git reset --hard commits/2 && > + git merge commits/4 && > + git branch merge/1 && > + git reset --hard commits/4 && > + git merge commits/6 && > + git branch merge/2 && > + git commit-graph write --reachable --changed-paths && > + graph_read_expect 12 > +' > + > +test_expect_success 'fork and fail to base a chain on a commit-graph file' ' > + test_when_finished rm -rf fork && > + git clone . fork && > + ( > + cd fork && > + rm .git/objects/info/commit-graph && > + echo "$(pwd)/../.git/objects" >.git/objects/info/alternates && > + test_commit new-commit && > + git commit-graph write --reachable --split --changed-paths && > + test_path_is_file $graphdir/commit-graph-chain && > + test_line_count = 1 $graphdir/commit-graph-chain && > + verify_chain_files_exist $graphdir > + ) > +' > + > +test_expect_success 'add three more commits, write a tip graph' ' > + git reset --hard commits/3 && > + git merge merge/1 && > + git merge commits/5 && > + git merge merge/2 && > + git branch merge/3 && > + git commit-graph write --reachable --split --changed-paths && > + test_path_is_missing $infodir/commit-graph && > + test_path_is_file $graphdir/commit-graph-chain && > + ls $graphdir/graph-*.graph >graph-files && > + test_line_count = 2 graph-files && > + verify_chain_files_exist $graphdir > +' > + > +graph_git_behavior 'split commit-graph: merge 3 vs 2' merge/3 merge/2 > + > +test_expect_success 'add one commit, write a tip graph' ' > + test_commit 11 && > + git branch commits/11 && > + git commit-graph write --reachable --split --changed-paths && > + test_path_is_missing $infodir/commit-graph && > + test_path_is_file $graphdir/commit-graph-chain && > + ls $graphdir/graph-*.graph >graph-files && > + test_line_count = 3 graph-files && > + verify_chain_files_exist $graphdir > +' > + > +graph_git_behavior 'three-layer commit-graph: commit 11 vs 6' commits/11 commits/6 > + > +test_expect_success 'add one commit, write a merged graph' ' > + test_commit 12 && > + git branch commits/12 && > + git commit-graph write --reachable --split --changed-paths && > + test_path_is_file $graphdir/commit-graph-chain && > + test_line_count = 2 $graphdir/commit-graph-chain && > + ls $graphdir/graph-*.graph >graph-files && > + test_line_count = 2 graph-files && > + verify_chain_files_exist $graphdir > +' > + > +graph_git_behavior 'merged commit-graph: commit 12 vs 6' commits/12 commits/6 > + > +test_expect_success 'create fork and chain across alternate' ' > + git clone . fork && > + ( > + cd fork && > + git config core.commitGraph true && > + rm -rf $graphdir && > + echo "$(pwd)/../.git/objects" >.git/objects/info/alternates && > + test_commit 13 && > + git branch commits/13 && > + git commit-graph write --reachable --split --changed-paths && > + test_path_is_file $graphdir/commit-graph-chain && > + test_line_count = 3 $graphdir/commit-graph-chain && > + ls $graphdir/graph-*.graph >graph-files && > + test_line_count = 1 graph-files && > + git -c core.commitGraph=true rev-list HEAD >expect && > + git -c core.commitGraph=false rev-list HEAD >actual && > + test_cmp expect actual && > + test_commit 14 && > + git commit-graph write --reachable --split --changed-paths --object-dir=.git/objects/ && > + test_line_count = 3 $graphdir/commit-graph-chain && > + ls $graphdir/graph-*.graph >graph-files && > + test_line_count = 1 graph-files > + ) > +' > + > +graph_git_behavior 'alternate: commit 13 vs 6' commits/13 commits/6 > + > +test_expect_success 'test merge stragety constants' ' > + git clone . merge-2 && > + ( > + cd merge-2 && > + git config core.commitGraph true && > + test_line_count = 2 $graphdir/commit-graph-chain && > + test_commit 14 && > + git commit-graph write --reachable --split --changed-paths --size-multiple=2 && > + test_line_count = 3 $graphdir/commit-graph-chain > + > + ) && > + git clone . merge-10 && > + ( > + cd merge-10 && > + git config core.commitGraph true && > + test_line_count = 2 $graphdir/commit-graph-chain && > + test_commit 14 && > + git commit-graph write --reachable --split --changed-paths --size-multiple=10 && > + test_line_count = 1 $graphdir/commit-graph-chain && > + ls $graphdir/graph-*.graph >graph-files && > + test_line_count = 1 graph-files > + ) && > + git clone . merge-10-expire && > + ( > + cd merge-10-expire && > + git config core.commitGraph true && > + test_line_count = 2 $graphdir/commit-graph-chain && > + test_commit 15 && > + git commit-graph write --reachable --split --changed-paths --size-multiple=10 --expire-time=1980-01-01 && > + test_line_count = 1 $graphdir/commit-graph-chain && > + ls $graphdir/graph-*.graph >graph-files && > + test_line_count = 3 graph-files > + ) && > + git clone --no-hardlinks . max-commits && > + ( > + cd max-commits && > + git config core.commitGraph true && > + test_line_count = 2 $graphdir/commit-graph-chain && > + test_commit 16 && > + test_commit 17 && > + git commit-graph write --reachable --split --changed-paths --max-commits=1 && > + test_line_count = 1 $graphdir/commit-graph-chain && > + ls $graphdir/graph-*.graph >graph-files && > + test_line_count = 1 graph-files > + ) > +' > + > +test_expect_success 'remove commit-graph-chain file after flattening' ' > + git clone . flatten && > + ( > + cd flatten && > + test_line_count = 2 $graphdir/commit-graph-chain && > + git commit-graph write --reachable && > + test_path_is_missing $graphdir/commit-graph-chain && > + ls $graphdir >graph-files && > + test_must_be_empty graph-files > + ) > +' > + > +graph_git_behavior 'graph exists' merge/octopus commits/12 > + > +test_expect_success 'split across alternate where alternate is not split' ' > + git commit-graph write --reachable && > + test_path_is_file .git/objects/info/commit-graph && > + cp .git/objects/info/commit-graph . && > + git clone --no-hardlinks . alt-split && > + ( > + cd alt-split && > + rm -f .git/objects/info/commit-graph && > + echo "$(pwd)"/../.git/objects >.git/objects/info/alternates && > + test_commit 18 && > + git commit-graph write --reachable --split --changed-paths && > + test_line_count = 1 $graphdir/commit-graph-chain > + ) && > + test_cmp commit-graph .git/objects/info/commit-graph > +' > + > +test_done