On 3/3/2022 6:19 AM, Patrick Steinhardt wrote:
> On Wed, Mar 02, 2022 at 09:57:17AM -0500, Derrick Stolee wrote:
>> On 3/2/2022 8:57 AM, Patrick Steinhardt wrote:
>>> On Tue, Mar 01, 2022 at 10:25:46AM -0500, Derrick Stolee wrote:
>>>> On 3/1/2022 9:53 AM, Patrick Steinhardt wrote:
>>
>>>>> Hum. I have re-verified, and this indeed seems to play out. So I must've
>>>>> accidentally ran all my testing with the state generated without the
>>>>> final patch which fixes the corruption. I do see lots of the following
>>>>> warnings, but overall I can verify and write the commit-graph just fine:
>>>>>
>>>>>     commit-graph generation for commit c80a42de8803e2d77818d0c82f88e748d7f9425f is 1623362063 < 1623362139
>>>>
>>>> But I'm not able to generate these warnings from either version. I
>>>> tried generating different levels of a split commit-graph, but
>>>> could not reproduce it. If you have reproduction steps using current
>>>> 'master' (or any released Git version) and the four patches here,
>>>> then I would love to get a full understanding of your errors.
>>>>
>>>> Thanks,
>>>> -Stolee
>>>
>>> I haven't yet been able to reproduce it with publicly available data,
>>> but with the internal references I'm able to evoke the warnings
>>> reliably. It only works when I have two repositories connected via
>>> alternates, when generating the commit-graph in the linked-to repo
>>> first, and then generating the commit-graph in the linking repo.
>>>
>>> The following recipe allows me to reproduce, but rely on private data:
>>>
>>>     $ git --version
>>>     git version 2.35.1
>>>
>>>     # The pool repository is the one we're linked to from the fork.
>>>     $ cd "$pool"
>>>     $ rm -rf objects/info/commit-graph objects/info/commit-graph
>>>     $ git commit-graph write --split
>>>
>>>     $ cd "$fork"
>>>     $ rm -rf objects/info/commit-graph objects/info/commit-graph
>>>     $ git commit-graph write --split
>>>
>>>     $ git commit-graph verify --no-progress
>>>     $ echo $?
>>>     0
>>>
>>>     # This is 715d08a9e51251ad8290b181b6ac3b9e1f9719d7 with your full v2
>>>     # applied on top.
>>>     $ ~/Development/git/bin-wrappers/git --version
>>>     git version 2.35.1.358.g7ede1bea24
>>>
>>>     $ ~/Development/git/bin-wrappers/git commit-graph verify --no-progress
>>>     commit-graph generation for commit 06a91bac00ed11128becd48d5ae77eacd8f24c97 is 1623273624 < 1623273710
>>>     commit-graph generation for commit 0ae91029f27238e8f8e109c6bb3907f864dda14f is 1622151146 < 1622151220
>>>     commit-graph generation for commit 0d4582a33d8c8e3eb01adbf564f5e1deeb3b56a2 is 1631045222 < 1631045225
>>>     commit-graph generation for commit 0daf8976439d7e0bb9710c5ee63b570580e0dc03 is 1620347739 < 1620347789
>>>     commit-graph generation for commit 0e0ee8ffb3fa22cee7d28e21cbd6df26454932cf is 1623783297 < 1623783380
>>>     commit-graph generation for commit 0f08ab3de6ec115ea8a956a1996cb9759e640e74 is 1621543278 < 1621543339
>>>     commit-graph generation for commit 133ed0319b5a66ae0c2be76e5a887b880452b111 is 1620949864 < 1620949915
>>>     commit-graph generation for commit 1341b3e6c63343ae94a8a473fa057126ddd4669a is 1637344364 < 1637344384
>>>     commit-graph generation for commit 15bdfc501c2c9f23e9353bf6e6a5facd9c32a07a is 1623348103 < 1623348133
>>>     ...
>>>     $ echo $?
>>>     1
>>>
>>> When generating commit-graphs with your patches applied the `verify`
>>> step works alright.
>>>
>>> I've also by accident stumbled over the original error again:
>>>
>>>     fatal: commit-graph requires overflow generation data but has none
>>>
>>> This time it's definitely not caused by generating commit-graphs with an
>>> in-between state of your patch series because the data comes straight
>>> from production with no changes to the commit-graphs performed by
>>> myself. There we're running Git v2.33.1 with a couple of backported
>>> patches (see [1]). While those patches cause us to make more use of the
>>> commit-graph, none modify the way we generate them.
>>>
>>> Of note is that the commit-graph contains references to commits which
>>> don't exist in the ODB anymore.
>>>
>>> Patrick
>>>
>>> [1]: https://gitlab.com/gitlab-org/gitlab-git/-/commits/pks-v2.33.1.gl3
>>
>> Thank you for your diligence here, Patrick. I really appreciate the
>> work you're putting in to verify the situation.
>>
>> Since our repro relies on private information, but is consistent, I
>> wonder if we should take the patch below, which starts to ignore the
>> older generation number v2 data and only writes freshly-computed
>> numbers.
>>
>> Thanks,
>> -Stolee
>
> Thanks. With your patch below the `fatal:` error is gone, but I'm still
> seeing the same errors with regards to the commit-graph generations.

This is disappointing and unexpected. Thanks for verifying.

> So to summarize my findings:
>
>     - This bug occurs when writing commit-graphs with v2.35.1, but
>       reading them with your patches.
>
>     - This bug occurs when I have two repositories connected via an
>       alternates file. I haven't yet been able to reproduce it in a
>       single repository that is not connected to a separate ODB.

This is an interesting distinction. One that I didn't think would
matter, but I'll look into the code to see how that could affect things.

>     - This bug only occurs when I first generate the commit-graph in the
>       repository I'm borrowing objects from.
>
>     - This bug only occurs when I write commit-graphs with `--split` in
>       both repositories. "Normal" commit-graphs don't have this issue,
>       and neither can I see it with `--split=replace` or mixed-type
>       commit-graphs.
>
> Beware, the following explanation is based on my very basic
> understanding of the commit-graph code and thus more likely to be wrong
> than right:
>
> With the old Git version, we've been mis-parsing the generation because
> `read_generation_data` wasn't ever set.
> As a result it can happen that
> the second split commit-graph we're generating computes its own
> generation numbers from the wrong starting point because it uses the
> mis-parsed generation numbers from the parent commit-graph.
>
> With your patches, we start to correctly account for overflows and would
> thus end up with a different value for the generation depending on where
> we parse the commit from: if we parse it from the first commit-graph it
> would be correct because it's contains the "root" of the generation
> numbers. But if we parse a commit from the second commit-graph we may
> have a mismatch because the generation numbers in there may have been
> derived from generation numbers mis-parsed from the first commit-graph.
> And because these would be wrong in case there was an overflow it is
> clear that the new corrected generation number may be wrong, as well.

Hm. My expectation was that the older layers of the split commit-graph
would have read_generation_data disabled (because the new Git version
cannot read the GDAT chunk) and then the
validate_mixed_generation_chain() method would remove
read_generation_data from all of the graphs in the list.
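
To spell out that expectation, here is a minimal standalone sketch. It
is not the real commit-graph.c code: `struct layer` and
`normalize_chain()` are hypothetical names made up for illustration, and
the only thing borrowed from Git is the idea of a per-layer
read_generation_data flag. If any layer in the chain cannot supply
generation data, the flag is cleared on every layer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one layer of a split commit-graph chain. */
struct layer {
	int read_generation_data; /* 1 if this layer's generation data is usable */
	struct layer *base;       /* next older layer, or NULL at the bottom */
};

/*
 * If any layer in the chain lacks usable generation data, clear the
 * flag on every layer so the whole chain falls back together.
 * Returns 1 if the chain still reads generation data, 0 otherwise.
 */
int normalize_chain(struct layer *top)
{
	struct layer *l;
	int all = 1;

	/* First pass: detect a layer without generation data. */
	for (l = top; l; l = l->base)
		if (!l->read_generation_data)
			all = 0;

	/* Second pass: a mixed chain disables the flag everywhere. */
	if (!all)
		for (l = top; l; l = l->base)
			l->read_generation_data = 0;

	return top && all;
}
```

In this toy model, a chain whose bottom layer has the flag unset ends up
with the flag unset on the top layer as well, which is the end state I
assumed we already had.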

Combining this with your thoughts on cross-alternate split
commit-graphs, this makes me think we should try this:

--- >8 ---

diff --git a/commit-graph.c b/commit-graph.c
index fb2ced0bd6..74c6534f56 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -609,8 +609,6 @@ struct commit_graph *read_commit_graph_one(struct repository *r,
 	if (!g)
 		g = load_commit_graph_chain(r, odb);
 
-	validate_mixed_generation_chain(g);
-
 	return g;
 }
 
@@ -668,7 +666,13 @@ static int prepare_commit_graph(struct repository *r)
 	     !r->objects->commit_graph && odb;
 	     odb = odb->next)
 		prepare_commit_graph_one(r, odb);
-	return !!r->objects->commit_graph;
+
+	if (r->objects->commit_graph) {
+		validate_mixed_generation_chain(r->objects->commit_graph);
+		return 1;
+	}
+
+	return 0;
 }
 
 int generation_numbers_enabled(struct repository *r)

--- >8 ---

Notice that I'm moving the validate_mixed_generation_chain() call out
of read_commit_graph_one() and into prepare_commit_graph(). To my
understanding, this _should_ have an equivalent end state as the old
code, but might be worth trying just as a quick check.

I will continue investigating and try to reproduce with this additional
constraint of working across an alternate.

Thanks,
-Stolee