On Tue, Mar 24, 2020 at 7:09 AM gitgitgadget[bot] <notifications@xxxxxxxxxx> wrote: > Contribution format > ~~~~~~~~~~~~~~~~~~~ > This is a bit of an unusual contribution to Git: > > - the heart of the series is in a single commit > - it is under a different license than most of Git > - the code doesn't use the same dialect as and "look like" the rest of > Git > > All three of these are because the code is meant to be a standalone > library that can be used by Git, libgit2, and any other project that > wants to understand reftables. But they create obstacles to reviewing > (one reviewer said he had spent a two-hour block looking at the patch > and not made much headway). > > On the other hand, people mostly agreed that having the ability to > share this code with libgit2 is beneficial. So how can we get a > substantial review without losing that benefit? > > 1. Most importantly, people would like a series of multiple patches > that tell a story. If review of an early patch in the series > reveals significant changes that should happen in reftable > upstream, that's fine --- upstream, that can happen in a patch on > top, whereas in the series being contributed, the patch would be > amended. > > In the end, there would be two different series of commits: the > contributed series, and the upstream history. The upstream > history might be messier, and that's fine. The contributed series > would be applied to git.git. This not very practical. I currently sync the reftable library into git by means of the reftable/update.sh script, and creating a layered commits is a lot of busywork. Also, getting the entire code in one go lets reviewers see how the whole thing fits together. I do understand the complaint, though. The good news is that there already is a story in the source code; you just have to read it in the right order. That order is (bottom-up): * record.{c,h} - (de)serialized single records * block.{c,h} - read and write blocks * writer.c - write a reftable * reader.c - read a reftable * merged.{c,h}, pq.{c,h} - reading a stack of reftables * stack.{c,h} - writing and compacting stacks of reftable on the filesystem. before anyone attempts to review the code, they should read the specification of the format very carefully. > 2. Both Git and libgit2 have abstractions like strbuf. We'd like > reftable to make use of similar abstractions where they make the > code cleaner. Have you looked at slice.{c,h} ? Is there any specific code that anyone had an issue with? > 3. libgit2 has a git_malloc allocator. reftable doesn't necessarily > have to use it or make it pluggable --- at worst, we can #define > malloc to use it. But it's something to be aware of. I added pluggable malloc routines. > Maintenance > ~~~~~~~~~~~ > There's some interest in the maintenance story: if we run into an > issue with the reftable library, do we report and fix it on the git > mailing list or does the reftable library have its own upstream forums > for that? There is no specific forum. The most practical solution would be report to the git mailing list with cc to hanwen@xxxxxxxxxx. Alternatively, an issue on the github project would also work. I don't foresee much need for maintenance, though. (Or do you foresee a need to store more types of data besides refs and reflogs?) > Portability > ~~~~~~~~~~~ > There was some confusion about the scope of what the reftable library > provides. It provides a reader and a writer and the caller is > responsible for connecting those to files. No, it also does transactional updates to a stack of reftables on the filesystem. > There were some questions about mmap usage here (there is none) and The format is suitable for mmap, but it seems premature optimization, given the (lack of) efficiency of loose+packed refs storage that it indends to replace. > whether the library needs some kind of seeking reader interface for > abstracting the OS interface to files. Look at reftable_block_source in reftable.h > I think the upshot is > > 4. Some summary documentation would be helpful e.g. in the README.md, > to point people to what header file a person should start with to > understand the library reftable.h is a good start, as it declares the public interface, and is extensively documented. It also addresses many of the points that this email raised. > 5. People are also interested in the file-oriented part of reftable, > even though this library doesn't implement it (that's patch 6, > which is Git-specific). No, this is incorrect. > Testing > ~~~~~~~ > Git's testsuite has various GIT_TEST_* knobs. A GIT_TEST_REF_BACKEND > knob would be helpful, to allow running the full testsuite with > reftable. That's a good way to suss out edge cases. I agree, but this outside my personal area of expertise. I think this is best tackled by someone else. > The testsuite should *also* include some specific tests designed for > reftable. They can demonstrate edge cases and demonstrate some of the > benefits There are unittests in the upstream project. Edge cases in the library itself should be covered by unittests. Is there interest in adding these to the git project itself? I can add some Git tests for directory/file conflicts and case sensitivity. > We also discussed table-driven tests that can be shared between > implementations but didn't end up saying anything too concrete about > that. My plan is to add a cross-language tests, that passes data between JGit, Go and C implementations to verify that they interoperate. -- Han-Wen Nienhuys - hanwenn@xxxxxxxxx - http://www.xs4all.nl/~hanwen