Hello everyone!

Over the last two weeks, I have worked on refining the performance
report on generation numbers. Here are our conclusions:

- Corrected Commit Date With Monotonic Offset (i.e. generation number
  v5) performs better than topological levels but still walks too many
  commits when compared with Corrected Commit Date.

  Number of commits walked (git merge-base v4.8 v4.9, on the linux
  repository):

  Topological Level                          : 635579
  Corrected Commit Date                      : 167468
  Corrected Commit Date With Monotonic Offset: 506577

  As such, I am expecting that we will store Corrected Commit Date in
  an additional chunk (called "generation data chunk") and store
  topological levels in CDAT. Thus, old Git clients can operate as
  expected, with new Git clients using the better generation number.

- Using a new chunk does affect locality of reference, but it did not
  impact performance appreciably.

- This does increase the size of the commit-graph file by nearly 5%.

You can read more in my report [1] and in the pull request with
instructions to replicate the results [2].

[1]: https://lore.kernel.org/git/20200703082842.GA28027@Abhishek-Arch/T/#mda33f6e13873df55901768e8fd6d774282002146
[2]: https://github.com/abhishekkumar2718/git/pull/1

I talk a bit more about a patch I worked on that tried to improve the
performance of commit-graph write using buffers; it ultimately did not
work and has been dropped. Up next is actually implementing the
generation number and taking care of all the little details.

https://abhishekkumar2718.github.io/programming/2020/07/05/gsoc-weeks-4-5.html

Feedback and suggestions welcome!

Thanks
- Abhishek

--------

Re-sending this email as I forgot to cc git@xxxxxxxxxxxxxxx
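
For readers who have not followed the earlier threads, the sketch below
illustrates two of the generation numbers compared above, topological
level and corrected commit date, on a toy history. It is written in
Python for brevity and is not Git's implementation; the `history`
dictionary, the commit names, the dates, and the function name
`generation_numbers` are all hypothetical. The definitions assumed here
are: a root commit has level 1 and a corrected date equal to its
committer date; any other commit has level one more than the largest
parent level, and a corrected date equal to the larger of its committer
date and one more than the largest parent corrected date.

```python
from typing import Dict, List, Tuple

# Hypothetical toy history: commit name -> (committer date, parents).
# Commit "B" has a committer date older than its parent "A" (clock skew).
history: Dict[str, Tuple[int, List[str]]] = {
    "A": (100, []),
    "B": (90, ["A"]),
    "C": (150, ["A"]),
    "D": (120, ["B", "C"]),
}


def generation_numbers(
    history: Dict[str, Tuple[int, List[str]]]
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (topological level, corrected commit date) for every commit."""
    level: Dict[str, int] = {}
    corrected: Dict[str, int] = {}

    def visit(commit: str) -> None:
        if commit in level:
            return  # already computed
        date, parents = history[commit]
        for parent in parents:
            visit(parent)
        if parents:
            # One more than the largest value among the parents.
            level[commit] = 1 + max(level[p] for p in parents)
            corrected[commit] = max(date, 1 + max(corrected[p] for p in parents))
        else:
            # Root commit: level 1, corrected date is the committer date.
            level[commit] = 1
            corrected[commit] = date

    # Recursion is fine for a toy graph; a real history would need an
    # iterative walk to avoid deep recursion.
    for commit in history:
        visit(commit)
    return level, corrected


level, corrected = generation_numbers(history)
```

In this toy history, "B" gets a corrected date of 101 even though its
committer date is 90, so both numbers are strictly larger for a commit
than for any of its parents.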