On 4/26/2018 8:58 AM, Derrick Stolee wrote:
n 4/25/2018 10:35 PM, Junio C Hamano wrote:
Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:
@@ -439,6 +439,9 @@ static void write_graph_chunk_data(struct
hashfile *f, int hash_len,
else
packedDate[0] = 0;
+ if ((*list)->generation != GENERATION_NUMBER_INFINITY)
+ packedDate[0] |= htonl((*list)->generation << 2);
+
packedDate[1] = htonl((*list)->date);
hashwrite(f, packedDate, 8);
The ones that have infinity are written as zero here. The code that
reads the generation field off of a file in fill_commit_graph_info()
and fill_commit_in_graph() both leave such a record in file as-is,
so the reader of what we write out will think it is _ZERO, not _INF.
Not that it matters, as it seems that most of the code being added
by this series treat _ZERO and _INF more or less interchangeably.
But it does raise another question, i.e. do we need both _ZERO and
_INF, or is it sufficient to have just a single _UNKNOWN?
This code is confusing. The 'if' condition is useless, since at this
point every commit should be finite (since we computed generation
numbers for everyone). We should just write the value always.
For the sake of discussion, the value _INFINITY means not in the graph
and _ZERO means in the graph without a computed generation number.
It's a small distinction, but it gives a single boundary to use for
reachability queries that test generation number.
@@ -571,6 +574,46 @@ static void close_reachable(struct
packed_oid_list *oids)
}
}
+static void compute_generation_numbers(struct commit** commits,
+ int nr_commits)
+{
+ int i;
+ struct commit_list *list = NULL;
+
+ for (i = 0; i < nr_commits; i++) {
+ if (commits[i]->generation != GENERATION_NUMBER_INFINITY &&
+ commits[i]->generation != GENERATION_NUMBER_ZERO)
+ continue;
+
+ commit_list_insert(commits[i], &list);
+ while (list) {
+ struct commit *current = list->item;
+ struct commit_list *parent;
+ int all_parents_computed = 1;
+ uint32_t max_generation = 0;
+
+ for (parent = current->parents; parent; parent =
parent->next) {
+ if (parent->item->generation ==
GENERATION_NUMBER_INFINITY ||
+ parent->item->generation ==
GENERATION_NUMBER_ZERO) {
+ all_parents_computed = 0;
+ commit_list_insert(parent->item, &list);
+ break;
+ } else if (parent->item->generation >
max_generation) {
+ max_generation = parent->item->generation;
+ }
+ }
+
+ if (all_parents_computed) {
+ current->generation = max_generation + 1;
+ pop_commit(&list);
+ }
If we haven't computed all parents' generations yet,
current->generation is undefined (or at least "left as
initialized"), so it does not make much sense to attempt to clip it
at _MAX at this point. At leat not yet.
IOW, shouldn't the following two lines be inside the "we now know
genno of all parents, so we can compute genno for commit" block
above?
You're right! Good catch. This code sets every merge commit to _MAX.
It should be in the block above.
+ if (current->generation > GENERATION_NUMBER_MAX)
+ current->generation = GENERATION_NUMBER_MAX;
+ }
+ }
This bothered me: why didn't I catch a bug here? I rebased my "fsck" RFC
onto this branch and it succeeded. Then, I realized that this does not
actually write incorrect values, since we re-visit this commit again
after we pop the stack down to this commit. However, there is time in
the middle where we have set the generation (in memory) incorrectly and
that could easily turn into a real bug by a later change.
I'll stick the _MAX check in the if above to prevent confusion.
Thanks,
-Stolee