Derrick Stolee <stolee@xxxxxxxxx> writes: > On 6/25/2019 3:51 AM, Jakub Narebski wrote: >> Jakub Narebski <jnareb@xxxxxxxxx> writes: >>> Derrick Stolee <stolee@xxxxxxxxx> writes: [...] >>> O.K., so the "generation number v2 (legacy)" would be incremental and >>> backward-compatibile in use (though not in generation and validation). [...] >>> Do you have benchmark for this "monotonically offset corrected commit >>> date" generation number in https://github.com/derrickstolee/git/commits/reach-perf >>> and https://github.com/derrickstolee/gen-test ? >> >> I guess this will have to wait... > > I have not had time to revisit this topic and re-run performance > numbers, sorry. I have created pull requests against `reach-perf` branch of derrickstolee/git repository[1], and companion pull request against gen-test repository[2] with proposed prototype of backward-compatible corrected commit date (with monotonic offsets). Could you please run the tests for this generation number v5? I was not able to do so. It was my first time trying to compile Git on MS Windows, and while there were no problems compiling `master` (well, except for compilation taking a long time), I was unable to do it for `reach-perf` branch because of independent of change compilation errors. It would be good to test also the legacy mode, i.e. old Git (using generation number v0, i.e. topological levels (shortest path from node to sink, plus one) with all backward-compatible generation numbers, or at least generation number v5. Do I understand it correctly that for the time being because of backward-compatibility concerns instead of incrementing the version number there would be used some kind of "metadata" chunk (at least until version of Git that fails hard, instead of turning off commit-graph feature softly, on unknown version numbers dies out). Is there some proposal how such chunk should look like? [1]: https://github.com/derrickstolee/git/pull/10 [2]: https://github.com/derrickstolee/gen-test/pull/1 --- >8 ---- >8 ---- >8 ---- >8 ---- >8 ---- >8 ---- >8 ---- >8 ---- From: Jakub Narębski <jnareb@xxxxxxxxx> Subject: [PATCH] commit-graph: generation v5 (backward compatible date ceiling) This generation number option is backward-compatible (with reachability index v0) version of the generation number v3, that is the corrected commit date. It takes the simple approach of faking "commit dates" being in the right order by adding an offset to the commit date (which is stored in the generation number column), so the modified commit date is always at least one more than the maximum modified commit date of all ancestors. Additionally the offset itself is adjusted in such way that the offset of a commit is always at least one more than the offsets of all ancestors. Two following conditions are satisfied on the offset of a commit C for every parent P: 1. committer_date(C) + offset(C) > committer_date(P) + offset(P) 2. offset(C) > offset(P) This reachability index is locally computable, which means it is compatible with incremental commit graph feature. It has the same problem as v3, namely that we have a "maximum clock skew" based on the maximum generation number we can store, only more so. Note: it includes incidental whitespace cleanups for v3 code. Signed-off-by: Jakub Narębski <jnareb@xxxxxxxxx> --- Note that the diff looks a bit strange in the part adding the code for the new compute_generation_numbers_5() subroutine, because it got entangled in whitespace cleanup (which I shouldn't do, I'm sorry). Relevant part of README.md from `gen-test` repository: -------- **V5: Corrected Commit Date with Strictly Monotonic Offset.** For a commit C, let its _offset commit date_ (denoted by odate(C)) be a commit date plus some offset, i.e. odate(C) = date(C) + offset(C), such that: 1. Offset commit date is greater than the maximum of the commit date of C and the offset commit dates of its parents. If odate(A) < odate(B), then A cannot reach B. 2. Offset of a commit is one more than the maximum offset of a parent, or more If offset(A) < offset(B), then A cannot reach B. This is backward-compatible version of V3: Corrected Commit Date. ### Comparing Reachability Index Versions Viability [...] | Index | Compatible? | Immutable? | Local? | |---------------------------|-------------|------------|--------| | Minimum Generation Number | Yes | Yes | Yes | | (Epoch, Date) pairs | Yes | Yes | Yes | | Maximum Generation Number | Yes | No | No | | Corrected Commit Date | No | Yes | Yes | | FELINE index | Yes | No | No | | Offset Commit Date *NEW* | Yes | Yes | Yes | _Note:_ The corrected commit date uses the generation number column to store an offset of "how much do I need to add to my commit date to get my corrected commit date?" The values stored in that column are then not backwards-compatible. _Note:_ The corrected commit date with strictly monotonic offset also uses the generation number column to store the date offset, but the offset alone can be used as generation number (as reachability index) itself. commit-graph.c | 126 +++++++++++++++++++++++++++++++++++++------------ 1 file changed, 95 insertions(+), 31 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 91863d4895..633b4b24f8 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -99,10 +99,9 @@ int compare_generations(struct generation *a, struct generation *b) return 1; return 0; - case 3: - ta = a->date + a->value1; - tb = b->date + b->value1; - + case 3: /* V3: Corrected Commit Date */ + case 5: /* V5: Strictly Monotonic Corrected Commit Date */ + /* handle special cases, i.e. commits outside commit graph */ if (a->value1 == GENERATION_NUMBER_INFINITY) { if (b->value1 == GENERATION_NUMBER_INFINITY) return 0; @@ -112,6 +111,10 @@ int compare_generations(struct generation *a, struct generation *b) return -1; } + /* corrected commit date = date + offset (correction) */ + ta = a->date + a->value1; + tb = b->date + b->value1; + if (ta < tb) return -1; if (ta > tb) @@ -162,6 +165,7 @@ void get_generation_version_from_commit(const struct commit *c, case 1: case 3: + case 5: gen->value1 = c->generation; gen->date = c->date; break; @@ -212,9 +216,10 @@ void set_generation_below_commit(const struct commit *c, struct generation *g) break; case 3: + case 5: /* ??? */ if (g->value1 + g->date >= gc.value1 + gc.date) { g->value1 = 0; - g->date = gc.value1 + gc.date; + g->date = gc.value1 + gc.date; } break; @@ -363,7 +368,7 @@ struct commit_graph *load_commit_graph_one(const char *graph_file) else graph->chunk_large_edges = data + chunk_offset; break; - + case GRAPH_CHUNKID_FELINE: if (graph->chunk_feline_gen) chunk_repeated = 1; @@ -971,27 +976,27 @@ static void compute_generation_numbers_3(struct packed_commit_list *commits) int i; struct commit_list *list = NULL; - for (i = 0; i < commits->nr; i++) { - if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY && - commits->list[i]->generation != GENERATION_NUMBER_ZERO) - continue; + for (i = 0; i < commits->nr; i++) { + if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY && + commits->list[i]->generation != GENERATION_NUMBER_ZERO) + continue; - commit_list_insert(commits->list[i], &list); + commit_list_insert(commits->list[i], &list); - while (list) { - struct commit *current = list->item; - struct commit_list *parent; - int all_parents_computed = 1; + while (list) { + struct commit *current = list->item; + struct commit_list *parent; + int all_parents_computed = 1; - timestamp_t max_timestamp = current->date; + timestamp_t max_timestamp = current->date; - for (parent = current->parents; parent; parent = parent->next) { - if (parent->item->generation == GENERATION_NUMBER_INFINITY || - parent->item->generation == GENERATION_NUMBER_ZERO) { - all_parents_computed = 0; - commit_list_insert(parent->item, &list); - break; - } else { + for (parent = current->parents; parent; parent = parent->next) { + if (parent->item->generation == GENERATION_NUMBER_INFINITY || + parent->item->generation == GENERATION_NUMBER_ZERO) { + all_parents_computed = 0; + commit_list_insert(parent->item, &list); + break; + } else { struct generation pg; timestamp_t pt; get_generation_version_from_commit(parent->item, 3, &pg); @@ -1001,19 +1006,75 @@ static void compute_generation_numbers_3(struct packed_commit_list *commits) if (pt > max_timestamp) max_timestamp = pt + 1; } - } + } + + if (all_parents_computed) { + current->generation = (uint32_t)(max_timestamp - current->date) + 1; + pop_commit(&list); + + if (current->generation > GENERATION_NUMBER_MAX) + die(_("generation number gap is too high!")); + } + } + } +} + +static void compute_generation_numbers_5(struct packed_commit_list *commits) +{ + int i; + struct commit_list *list = NULL; + + for (i = 0; i < commits->nr; i++) { + /* skip already computed generation numbers */ + if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY && + commits->list[i]->generation != GENERATION_NUMBER_ZERO) + continue; + + commit_list_insert(commits->list[i], &list); + + while (list) { + struct commit *current = list->item; + struct commit_list *parent; + int all_parents_computed = 1; + + timestamp_t max_timestamp = current->date; + uint32_t max_generation = 0; + + for (parent = current->parents; parent; parent = parent->next) { + if (parent->item->generation == GENERATION_NUMBER_INFINITY || + parent->item->generation == GENERATION_NUMBER_ZERO) { + all_parents_computed = 0; + commit_list_insert(parent->item, &list); + break; + + } else { + struct generation pg; + timestamp_t pt; + get_generation_version_from_commit(parent->item, 5, &pg); + + pt = pg.value1 + pg.date; + + if (pt > max_timestamp) + max_timestamp = pt + 1; + if (pg.value1 > max_generation) + max_generation = pg.value1; + } + } - if (all_parents_computed) { + if (all_parents_computed) { current->generation = (uint32_t)(max_timestamp - current->date) + 1; - pop_commit(&list); + if (current->generation < max_generation + 1) + current->generation = max_generation + 1; + pop_commit(&list); - if (current->generation > GENERATION_NUMBER_MAX) - die(_("generation number gap is too high!")); - } - } - } + if (current->generation > GENERATION_NUMBER_MAX) + die(_("generation number gap is too high!")); + } + } + } } + static void compute_generation_numbers(struct packed_commit_list *commits, int generation_version) { @@ -1038,6 +1099,9 @@ static void compute_generation_numbers(struct packed_commit_list *commits, case 4: /* compute at write time */ return; + case 5: + compute_generation_numbers_5(commits); + return; } } -- 2.23.0.windows.1