Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes: > The generation number of a commit is defined recursively as follows: > > * If a commit A has no parents, then the generation number of A is one. > * If a commit A has parents, then the generation number of A is one > more than the maximum generation number among the parents of A. Very minor nitpick: it would be more readable wrapped differently: * If a commit A has parents, then the generation number of A is one more than the maximum generation number among parents of A. Very minor nitpick: possibly "parents", not "the parents", but I am not native English speaker. > > Add a uint32_t generation field to struct commit so we can pass this > information to revision walks. We use three special values to signal > the generation number is invalid: > > GENERATION_NUMBER_INFINITY 0xFFFFFFFF > GENERATION_NUMBER_MAX 0x3FFFFFFF > GENERATION_NUMBER_ZERO 0 > > The first (_INFINITY) means the generation number has not been loaded or > computed. The second (_MAX) means the generation number is too large to > store in the commit-graph file. The third (_ZERO) means the generation > number was loaded from a commit graph file that was written by a version > of git that did not support generation numbers. Good explanation; I wonder if we want to have it in some shortened form also in comments, and not only in the commit message. > > Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > --- > alloc.c | 1 + > commit-graph.c | 2 ++ > commit.h | 4 ++++ > 3 files changed, 7 insertions(+) I have reordered patches to make it easier to review. > diff --git a/commit.h b/commit.h > index 23a3f364ed..aac3b8c56f 100644 > --- a/commit.h > +++ b/commit.h > @@ -10,6 +10,9 @@ > #include "pretty.h" > > #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF > +#define GENERATION_NUMBER_INFINITY 0xFFFFFFFF > +#define GENERATION_NUMBER_MAX 0x3FFFFFFF > +#define GENERATION_NUMBER_ZERO 0 I wonder if it wouldn't be good to have some short in-line comments explaining those constants, or a block comment above them. > > struct commit_list { > struct commit *item; > @@ -30,6 +33,7 @@ struct commit { > */ > struct tree *maybe_tree; > uint32_t graph_pos; > + uint32_t generation; > }; > > extern int save_commit_buffer; All right, simple addition of the new field. Nothing to go wrong here. Sidenote: With 0x7FFFFFFF being (if I am not wrong) maximum graph_pos and maximum number of nodes in commit graph, we won't hit 0x3FFFFFFF generation number limit for all except very, very linear histories. > > diff --git a/alloc.c b/alloc.c > index cf4f8b61e1..e8ab14f4a1 100644 > --- a/alloc.c > +++ b/alloc.c > @@ -94,6 +94,7 @@ void *alloc_commit_node(void) > c->object.type = OBJ_COMMIT; > c->index = alloc_commit_index(); > c->graph_pos = COMMIT_NOT_FROM_GRAPH; > + c->generation = GENERATION_NUMBER_INFINITY; > return c; > } All right, start with initializing it with "not from commit-graph" value after allocation. > > diff --git a/commit-graph.c b/commit-graph.c > index 70fa1b25fd..9ad21c3ffb 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -262,6 +262,8 @@ static int fill_commit_in_graph(struct commit *item, struct commit_graph *g, uin > date_low = get_be32(commit_data + g->hash_len + 12); > item->date = (timestamp_t)((date_high << 32) | date_low); > > + item->generation = get_be32(commit_data + g->hash_len + 8) >> 2; > + I guess we should not worry about these "magical constants" sprinkled here, like "+ 8" above. Let's examine how it goes, taking a look at commit-graph-format.txt in Documentation/technical/commit-graph-format.txt * The first H (g->hash_len) bytes are for the OID of the root tree. * The next 8 bytes are for the positions of the first two parents [...] So 'commit_data + g->hash_len + 8' is our offset from the start of commit data. All right. * The next 8 bytes store the generation number of the commit and the commit time in seconds since EPOCH. The generation number uses the higher 30 bits of the first 4 bytes. [...] The higher 30 bits of the 4 bytes, which is 32 bits, means that we need to shift 32-bit value 2 bits right, so that we get lower 30 bits of 32-bit value. All right. All 4-byte numbers are in network order. Shouldn't it be ntohl() to convert from network order to host order, and not get_be32()? I guess they are the same (network order is big-endian order), and get_be32() is what rest of git uses... Looks all right. > pptr = &item->parents; > > edge_value = get_be32(commit_data + g->hash_len);