On 3/26/2021 3:12 PM, Derrick Stolee via GitGitGadget wrote:
> As I prepare some ideas on index v5, one thing that strikes me as an
> interesting direction to try is to use the chunk-format API. This would make
> our extension model extremely simple (they become optional chunks, easily
> identified by the table of contents).
>
> But there is a huge hurdle to even starting that investigation: the index
> uses its own hashing methods, separate from the hashfile API in csum-file.c!
>
> The internals of the algorithms are mostly identical. The only possible
> change is that the buffer sizes are different: 8KB for hashfile and 128KB in
> read-cache.c. I was unable to find a performance difference in these two
> implementations, despite testing on several repo sizes.

Of course, shortly after I send this series (thinking I've checked all
the details carefully) I notice that I was using "git update-index
--really-refresh" for testing, but what I really wanted was
"git update-index --force-write".

In this case, I _do_ see a performance degradation using the hashfile
API. I will investigate whether this is just a poor implementation of
the nesting hashfile, or something else more tricky. Changing the
buffer size doesn't do the trick.

Please ignore this series for now. Sorry for the noise.

-Stolee