Here is v4 of a series that was previously only one patch. Thanks to the efforts of Thomas, Peff, and Taylor, we have a full understanding of what went wrong here. Details are in the patches themselves, but generally when writing a commit-graph chain we rely on the commit-graph parsing to know which commits are already in the lower layers of the chain. When 'core.commitGraph' is disabled, then we don't do that parsing, and then we add all reachable commits to the top layer and merge down. This causes us to hit the previous die() statement. This fixes the problem by first handling any duplicates we see during a merge (this is important for handling any other data out there in this situation) and also to disable commit-graph writes when 'core.commitGraph' is disabled. Thanks, -Stolee Derrick Stolee (2): commit-graph: ignore duplicates when merging layers commit-graph: don't write commit-graph when disabled Documentation/git-commit-graph.txt | 4 +++- commit-graph.c | 21 ++++++++++++++++++--- t/t5324-split-commit-graph.sh | 13 +++++++++++++ 3 files changed, 34 insertions(+), 4 deletions(-) base-commit: d98273ba77e1ab9ec755576bc86c716a97bf59d7 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-747%2Fderrickstolee%2Fcommit-graph-dup-commits-v4 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-747/derrickstolee/commit-graph-dup-commits-v4 Pull-Request: https://github.com/gitgitgadget/git/pull/747 Range-diff vs v3: 1: 9e760f07ac ! 1: 9279aea3ef commit-graph: ignore duplicates when merging layers @@ Commit message pointers. This allows us to get the end result we want without extra memory costs and minimal CPU time. - Since the root cause for producing commit-graph layers with these - duplicate commits is currently unknown, it is difficult to create a test - for this scenario. For now, we must rely on testing the example data - graciously provided in [1]. My local test successfully merged layers, - and 'git commit-graph verify' passed. + The root cause is due to disabling core.commitGraph, which prevents + parsing commits from the lower layers during a 'git commit-graph write + --split' command. Since we use the 'graph_pos' value to determine + whether a commit is in a lower layer, we never discover that those + commits are already in the commit-graph chain and add them to the top + layer. This layer is then merged down, creating duplicates. + + The test added in t5324-split-commit-graph.sh fails without this change. + However, we still have not completely removed the need for this + duplicate check. That will come in a follow-up change. Reported-by: Thomas Braun <thomas.braun@xxxxxxxxxxxxxxxxxxx> Helped-by: Taylor Blau <me@xxxxxxxxxxxx> @@ commit-graph.c: static void sort_and_scan_merged_commits(struct write_commit_gra stop_progress(&ctx->progress); } + + ## t/t5324-split-commit-graph.sh ## +@@ t/t5324-split-commit-graph.sh: test_expect_success '--split=replace with partial Bloom data' ' + verify_chain_files_exist $graphdir + ' + ++test_expect_success 'prevent regression for duplicate commits across layers' ' ++ git init dup && ++ git -C dup config core.commitGraph false && ++ git -C dup commit --allow-empty -m one && ++ git -C dup commit-graph write --split=no-merge --reachable && ++ git -C dup commit --allow-empty -m two && ++ git -C dup commit-graph write --split=no-merge --reachable && ++ git -C dup commit --allow-empty -m three && ++ git -C dup commit-graph write --split --reachable && ++ git -C dup commit-graph verify ++' ++ + test_done -: ---------- > 2: 4439e8ae8f commit-graph: don't write commit-graph when disabled -- gitgitgadget