On 4/10/2018 10:31 PM, Junio C Hamano wrote:
Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:
The generation number of a commit is defined recursively as follows:
* If a commit A has no parents, then the generation number of A is one.
* If a commit A has parents, then the generation number of A is one
more than the maximum generation number among the parents of A.
Add a uint32_t generation field to struct commit so we can pass this
information to revision walks. We use two special values to signal
the generation number is invalid:
GENERATION_NUMBER_ININITY 0xFFFFFFFF
GENERATION_NUMBER_ZERO 0
The first (_INFINITY) means the generation number has not been loaded or
computed. The second (_ZERO) means the generation number was loaded
from a commit graph file that was stored before generation numbers
were computed.
Should it also be possible for a caller to tell if a given commit
has too deep a history, i.e. we do not know its generation number
exactly, but we know it is larger than 1<<30?
It seems that we only have a 30-bit field in the file, so wouldn't
we need a special value defined in (e.g. "0") so that we can tell
that the commit has such a large generation number? E.g.
+ item->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
if (!item->generation)
item->generation = GENERATION_NUMBER_OVERFLOW;
when we read it from the file?
We obviously need to do something similar when assigning a
generation number to a child commit, perhaps like
#define GENERATION_NUMBER_OVERFLOW (GENERATION_NUMBER_MAX + 1)
commit->generation = 1; /* assume no parent */
for (p = commit->parents; p; p++) {
uint32_t gen = p->item->generation + 1;
if (gen >= GENERATION_NUMBER_OVERFLOW) {
commit->generation = GENERATION_NUMBER_OVERFLOW;
break;
} else if (commit->generation < gen)
commit->generation = gen;
}
or something? And then on the writing side you'd encode too large a
generation as '0'.
You raise a very good point. How about we do a slightly different
arrangement for these overflow commits?
Instead of storing the commits in the commit-graph file as "0" (which
currently means "written by a version of git that did not compute
generation numbers") we could let GENERATION_NUMBER_MAX be the maximum
generation of a commit in the commit-graph, and if a commit would have
larger generation, we collapse it down to that value.
It slightly complicates the diagram I made in
Documentation/technical/commit-graph.txt, but it was already a bit of a
simplification. Here is an updated diagram, but likely we will want to
limit discussion of the special-case GENERATION_NUMBER_MAX to the prose,
since it is not a practical situation at the moment.
+-----------------------------------------+
| GENERATION_NUMBER_INFINITY = 0xFFFFFFFF |
+-----------------------------------------+
| | | ^
| | | |
| | +------+
| | [gen(A) = gen(B)]
| V
| +------------------------------------+
| | GENERATION_NUMBER_MAX = 0x3FFFFFFF |
| +------------------------------------+
| | | ^
| | | |
| | +------+
| | [gen(A) = gen(B)]
V V
+-------------------------------------+
| 0 < commit->generation < 0x3FFFFFFF |
+-------------------------------------+
| | ^
| | |
| +------+
| [gen(A) > gen(B)]
V
+-------------------------------------+
| GENERATION_NUMBER_ZERO = 0 |
+-------------------------------------+
| ^
| |
+------+
[gen(A) = gen(B)]
Thanks,
-Stolee