On Tue, Aug 25, 2020 at 12:07:17PM +0200, Jakub Narębski wrote: > Hello, > > ... > > I think I was not clear enough (in trying to be brief). I meant here > loading available generation numbers for use in graph traversal, > done in later patches in this series. > > In _next_ commit we store topological levels in `generation` field: > > @@ -755,7 +763,11 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, > date_low = get_be32(commit_data + g->hash_len + 12); > item->date = (timestamp_t)((date_high << 32) | date_low); > > - graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; > + if (g->chunk_generation_data) > + graph_data->generation = item->date + > + (timestamp_t) get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index); > + else > + graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; > > > We use topo_level slab only when writing the commit-graph file. > Right, I thought the agenda outlined points in the process of writing commit-graph file. > > > We could avoid initializing topo_slab if we are not writing generation > > data chunk (and thus don't need corrected commit dates) but that > > wouldn't have an impact on run time while writing commit-graph because > > computing corrected commit dates is cheap as the main cost is in walking > > the graph and writing the file. > > Right. > > Though you need to add the cost of allocation and managing extra > commit slab, I think that amortized cost is negligible. > > But what would be better is showing benchmark data: does writing the > commit graph without GDAT take not insigificant more time than without > this patch? Right, we could compare time taken by master and series until (but not including this patcth) to write a commit-graph file. Will add. > > [...] > >>> @@ -2372,8 +2384,8 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) > >>> for (i = 0; i < g->num_commits; i++) { > >>> struct commit *graph_commit, *odb_commit; > >>> struct commit_list *graph_parents, *odb_parents; > >>> - timestamp_t max_generation = 0; > >>> - timestamp_t generation; > >>> + timestamp_t max_corrected_commit_date = 0; > >>> + timestamp_t corrected_commit_date; > >> > >> This is simple, and perhaps unnecessary, rename of variables. > >> Shouldn't we however verify *both* topological level, and > >> (if exists) corrected commit date? > > > > The problem with verifying both topological level and corrected commit > > dates is that we would have to re-fill commit_graph_data slab with commit > > data chunk as we cannot modify data->generation otherwise, essentially > > repeating the whole verification process. > > > > While it's okay for now, I might take this up in a future series [1]. > > > > [1]: https://lore.kernel.org/git/4043ffbc-84df-0cd6-5c75-af80383a56cf@xxxxxxxxx/ > > All right, I believe you that verifying both topological level and > corrected commit date would be more difficult. > > That doesn't change the conclusion that this variable should remain to > be named `generation`, as when verifying GDAT-less commit-graph files it > would check topological levels (it uses commit_graph_generation(), which > in turn uses `generation` field in commit graph info, which as I have > show above in later patch could be v1 or v2 generation number). > Right, I completely misunderstood you initially. Reverted the variable name changes. > Best, > -- > Jakub Narębski