Junio C Hamano <gitster@xxxxxxxxx> writes:
> Jakub Narebski <jnareb@xxxxxxxxx> writes:
>
>> About moving commit data with generation number v2 to "CDA2" chunk: if
>> "CDAT" chunk is missing then (I think) old Git would simply not use
>> commit-graph file at all; it may crash, but I don't think so.  If "CDAT"
>> chunk has zero length... I don't know what would happen then, possibly
>> also old Git would simply not use commit-graph data at all.
>
> Yeah, if it makes it crash, then we cannot use that "missing CDAT"
> approach.

I have not tested this, but from reading the code it looks like
"missing CDAT" makes Git fail softly -- it would return NULL for the
commit-graph, and thus not use commit-graph data at all... which might
be too high a price (too high a performance penalty for old Git).

>> Putting generation number v2 into separate chunk (which might be called
>> "GEN2" or "OFFS"/"DOFF") has the disadvantage of increasing the on disk
>> size of the commit graph, and possibly also increasing memory
>> consumption (the latter depends on how it would be handled), but has the
>> advantage of being fully backward compatible.  Old Git would simply
>> use generation numbers v1 in "CDAT", new Git would use generation
>> numbers v2 in "OFFS" -- combining commit creation date from "CDAT" and
>> offset from "OFFS"),
>
> Do we have an option *not* to record meaningful generation numbers
> in CDAT and have the current Git binaries understand and still use
> the rest of the graph file, while not using the optimizations that
> rely on having generation numbers?  If not, then the new version of
> Git that tries to be compatible with old one needs to compute both
> generation numbers, and we would need to keep the topological number
> for quite some time.

We can, as Derrick Stolee wrote, put zero (GENERATION_NUMBER_ZERO) for
the generation number.  Without generation number data we lose some of
the performance improvements, though.

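To make "compute both generation numbers" concrete, here is a minimal
Python sketch (not Git code) that derives the topological level (v1) and
the corrected-commit-date offset (v2) in a single walk.  The function
name, the dict-based graph representation and the toy history are made
up for illustration; the definitions assumed are level = 1 + max level
of parents (1 for a root), and corrected date = max(commit date,
1 + max corrected date of parents), with the offset being corrected
date minus commit date.

```python
from typing import Dict, List, Tuple


def compute_generations(
    parents: Dict[str, List[str]],
    commit_date: Dict[str, int],
) -> Dict[str, Tuple[int, int]]:
    """Return {commit: (topological_level, date_offset)} for every commit.

    Hypothetical helper: `parents` maps each commit to its parent list,
    `commit_date` maps each commit to its committer timestamp.
    """
    level: Dict[str, int] = {}
    corrected: Dict[str, int] = {}

    for start in parents:
        if start in level:
            continue
        stack = [start]
        while stack:
            c = stack[-1]
            # Parents must be finished before the commit itself.
            pending = [p for p in parents[c] if p not in level]
            if pending:
                stack.extend(pending)
                continue
            stack.pop()
            if c in level:
                # The same commit can be pushed by several children.
                continue
            # v1: topological level.
            level[c] = 1 + max((level[p] for p in parents[c]), default=0)
            # v2: corrected commit date, strictly greater than any parent's.
            corrected[c] = max(
                [commit_date[c]] + [corrected[p] + 1 for p in parents[c]]
            )

    return {c: (level[c], corrected[c] - commit_date[c]) for c in parents}


# Toy history: B has a skewed clock (its date is older than its parent's).
history = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
dates = {"A": 100, "B": 90, "C": 105, "D": 103}
result = compute_generations(history, dates)
```

Both values fall out of the same visit of each commit, which is why the
extra cost should be small if the walk itself dominates; only the skewed
commit B and its descendant D end up with a non-zero offset here.
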
On the other hand, computing generation number v1 (topological level)
and generation number v2 ([monotonic] offset for corrected commit date)
should not be much more costly than calculating a single generation
number, assuming that most of the cost is walking the commit graph.
But this would need benchmarking.

Also, as Stolee wrote, with generation number v2 in a separate chunk the
commit data is no longer kept together, but split into two areas.

>> and there should be no problems with updating
>> commit-graph file (either rewriting, or adding new commit-graph to the
>> chain).
>
> Would merging by the current Git also work well (meaning, would
> "GEN2" or whatever it does not understand be omitted)?

From the analysis of write_commit_graph_file(), it looks like unknown
chunks are simply skipped (omitted), but I have not checked it in
practice.

Best,
-- 
Jakub Narębski
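
P.S. A toy model of the reader behaviour guessed at above, in Python.
This is not Git's parser: load_graph() and the split into required and
optional chunks are assumptions made for illustration, and it only shows
what "fail softly on missing CDAT" and "skip unknown chunks" would mean
if that reading of the code is right.

```python
from typing import Dict, Optional

# Assumption for this toy model: the "old reader" insists on these chunks...
REQUIRED_CHUNKS = ("OIDF", "OIDL", "CDAT")
# ...and understands, but does not require, this one.
OPTIONAL_CHUNKS = ("EDGE",)


def load_graph(chunks: Dict[str, bytes]) -> Optional[Dict[str, bytes]]:
    """Return the chunks the toy reader would use, or None to fall back.

    None models "do not use commit-graph data at all": correct results,
    but without the speed-up.
    """
    if any(chunk_id not in chunks for chunk_id in REQUIRED_CHUNKS):
        return None
    known = REQUIRED_CHUNKS + OPTIONAL_CHUNKS
    # Unknown chunk IDs (e.g. a future "GEN2") are silently ignored.
    return {cid: data for cid, data in chunks.items() if cid in known}


with_gen2 = load_graph(
    {"OIDF": b"f", "OIDL": b"l", "CDAT": b"c", "GEN2": b"g"}
)
without_cdat = load_graph({"OIDF": b"f", "OIDL": b"l", "CDAT2": b"c"})
```

Under these assumptions the separate-chunk layout keeps the old reader
working (with_gen2 still carries "CDAT"), while dropping "CDAT" makes it
give up on the file entirely (without_cdat is None).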