Hi all,

I have noticed that not everyone is happy with the reftable series as it currently stands. I want to give some background on why it looks the way it does today, solicit feedback on where to go from here, and share my worries about the negative feedback I’ve gotten so far.

I manage the Gerrit team at Google. Gerrit uses refs extensively: we have many repositories with millions of refs. To address the problems this caused for Google, Shawn designed the reftable storage format, and we have been running it in production since 2017. Unfortunately, this doesn’t solve the problem for ‘normal’ deployments of Gerrit, and this limits the design choices that my team can make. To fix this, I have implemented support for reftable on local file systems in JGit, and I want to make it a feature that is supported by core git too.

I found the implementation somewhat gnarly, which makes it an interesting project. I thought it would be good for the ecosystem if related projects such as libgit2 could use the same implementation, so I wrote it as a library that is completely standalone (except for a dependency on zlib). It was originally written in Go, and then translated into C that is as idiomatic as I could manage. It is basically object-oriented with a small amount of polymorphism, so it should translate relatively easily to languages like Python.

As it currently stands, the library implements the specification completely, including the parts that are probably not useful for Git right now, like support for the object-ID/ref mapping. The API between reftable and git is extremely narrow: it’s about 20 functions and 5 structure definitions. It’s also well tested (see the coverage report [2]); as of today, the tests are leak-free according to valgrind. This is also why I don’t believe Johannes’ argument that, because he found one bug, the library must be full of bugs.

This approach has been successful, in the sense that the libgit2 project has experimented with integrating it; see https://github.com/libgit2/libgit2/pull/5462. They seem happy at the prospect of integrating existing code rather than reimplementing it.

Johannes had suggested that this should be developed and maintained in git-core first, and that the result could then be reused by the libgit2 project. According to the libgit2 folks, this is what that would look like:

"""
- It needs to be easy to split out from git-core. If it is self-contained in a single directory, then I'd be sufficiently happy already.

- It should continue providing a clean interface to external callers. This also means that its interface should stay stable so we don't have to adapt on every update. git-core historically never had such promises, but it kind of worked out for the xdiff code.

- My most important fear would be that the reftable interface becomes heavily dependent on git-core's own data types. We had this discussion at the Contributor's Summit already, but if it starts adopting things like git-core's own string buffer then it would become a lot harder for us to use it.

- Probably obvious, but contained in the above is that it stays compilable on its own. So even if you split out its directory and wire up some build instructions, it should not have any dependencies on git-core.
"""

(for the discussion at the summit: https://lore.kernel.org/git/1B71B54C-E000-4CEB-8AC6-3DB86E96E31A@xxxxxxxxxxxxxx/)

I can make that work, but it would be good to know whether this is something the project is OK with in principle, or whether the code needs to be completely Git-ified. If the latter happens, that would effectively fork the code, which I think would be a missed opportunity.

I have received a lot of negative comments, and I can’t gauge well what to make of them, but they kindle several worries:

* The git community decides they don’t need the object reverse index, and does not write the ‘o’ section in reftables, because that culls 1000 lines of code. This generally works, but will cause hard-to-debug performance regressions if command-line git is used on a Gerrit server.

* The git community looks at this, decides the standard is too complex, and goes off to create a reftable v3.

* The git community asks me to do a ton of superficial work (e.g. slice -> strbuf), and then decides the overall design needs to be different and that the code should be completely rewritten.

Jonathan Nieder said I shouldn’t worry about standards compliance, because the Git project has already ratified the reftable standard and wouldn’t want to break JGit compatibility, but it would be good to have the community leaders reaffirm this stance.

For my last worry, it would be good if someone would make an assessment of the overall design to see if it is acceptable.

Once we have a path forward, we can think of a way of integrating the code. I think it may take a while to shake out bugs. I don’t mean bugs in the reftable library itself; rather, Git is not very strict in enforcing proper use of the ref backend abstraction (e.g. pseudorefs [1]). Many integration tests also don’t go through the proper channels for accessing refs.

cheers,

[1] https://lore.kernel.org/git/CAFQ2z_NZgkPE+3oazfb_m0_7TWxHjje1yYCc0bMZG05_KUKEow@xxxxxxxxxxxxxx/
[2] https://hanwen.home.xs4all.nl/public/software/reftable-coverage/

--
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Paul Manicle, Halimah DeLaine Prado