Thanks for the quick response and for the patch. I started writing such a long process document because I _thought_ that I found a major issue with the Large Edge List. But, in the end, it was user error. It turned out that I was looking at index '11' when I should have been looking at index '0x11' ('17'). Whoops! On Thu, Jun 28, 2018 at 5:49 AM Derrick Stolee <stolee@xxxxxxxxx> wrote: > > On 6/28/2018 1:11 AM, Grant Welch wrote: > > I recently read the "Supercharging the Git Commit Graph blog by > > Derrick Stolee. I found the article interesting and wanted to verify > > the performance numbers for myself. Then that led me to want to know > > more about the implementation, so I read the technical design notes in > > commit-graph.txt, and then I jumped into the format documentation in > > commit-graph-format.txt. > > > > Along the way, I noticed a few issues. They might just be errors in > > the documentation, but I figured it was worth documenting my entire > > process just to be sure. > > > > "Supercharging the Git Commit Graph", by Derrick Stolee: > > https://blogs.msdn.microsoft.com/devops/2018/06/25/supercharging-the-git-commit-graph/ > > > > # TL;DR > > > > I found a few discrepencies between the documentation in > > commit-graph-format.txt and the results that I observed for myself. > > > > 1. The "Commit Data" chunk signature is documented to be 'CGET', but > > it should be 'CDAT'. > > > > commit-graph.c:18 > > #define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */ > > Thanks for catching this! Thankfully, this is an easy fix, as we only > need to update the documentation. > > > 2. The "no parent" value is documented to be 0xffffffff, but is > > actually 0x70000000. > > > > commit-graph.c:34 > > #define GRAPH_PARENT_NONE 0x70000000 > > This is a more important mistake, as it affects the data that was > written in the commit-graph file. > > Part of the problem is that leading hex digit of 0x7 which in binary is > 0b0111. We already designed a limit of at most 2^{31}-1 (~2.1 billion) > commits in the commit-graph because of the way we track octopus edges, > but this mistake has cost us more: we cannot store more than ~1.8 > billion commits. > > I'm sorry for this mixup, mostly because it is aesthetically unpleasant. > Those extra 300 million commits mean less to me than having a clean file > format. > > > 3. The "generation" field isn't set on any of the commits. (I don't > > know what to make of this.) > > This is a difference between 2.18 and current 'master', which merged > ds/generation-numbers. Commit-graphs written with Git 2.18 have all > generation numbers listed as GENERATION_NUMBER_ZERO (0), which lets > future versions know that the generation number is not computed yet, so > the next commit-graph write will compute the correct generation number. > > I'll send a patch soon fixing these doc issues. > > Thanks, > -Stolee -- -grant welch -gwelch925@xxxxxxxxx