On Mon, Oct 12, 2020 at 6:57 PM Jonathan Nieder <jrnieder@xxxxxxxxx> wrote: > > We could, but it would considerably complicate work on this patch > > series, as the commit boundary then doesn't fall on file boundaries > > anymore. Would you be open to having multiple headers for the public > > interface? eg. reftable-record.h, reftable-reader.h etc. ? > > I'd be more than open to it: I think that would make for a clearer API. Done. > >> Should these call BUG()? Or is it useful for some callers to be able > >> to recover from these errors? > > > > Since this was written as a standalone library, it's up to the caller > > to decide what should be done. > > I'm not strongly opinionated about this, but just a quick note: this > implies that the library would need to make sure it is producing a > valid state in error cases. This is what it already does. I tried to clarify this in some places. > Does the API documentation describe what state a handle is in after > an error --- e.g., what operations are permitted after that? API_ERROR should not result in state changes. > >>> +int reftable_error_to_errno(int err); > >> > >> What is the intended use of this function? > > > > The read_raw_ref method in the ref backend API uses errno values as > > out-of-band communication mechanism. > > Could we change that? It sounds error-prone. I moved it into refs/reftable-backend.c; I'd rather not also have to clean up the files backend as part of this work, though. > >> Do I pass in the 'struct block_source *' as the source arg? If so, why > >> are these declared as void *? > > > > you pass in block_source->arg. Should struct implementations of > > polymorphic types carry a pointer to their own vtable instead? > > I think a pointer to block_source would make it more self-explanatory, > yes. This has a number of downsides, though. It means I have to introduce separate structs strbuf_blocksource and file_blocksource, along with functions to create and deallocate them, so it makes the whole thing more unwieldy to use. > >> Is the reason this manages the buffer instead of requiring a > >> caller-supplied buffer to support zero-copy? > > > > Log blocks are compressed, so the caller doesn't know the correct size > > to supply. By letting the block source handle the management, we can > > swap out the block read from the file for a block managed by malloc on > > decompressing a log block. > > Hm, my naive assumption would have been that we'd use different > buffers for the compressed and uncompressed data. Maybe you can comment on how to solve this differently when we get to the implementation of block reading/writing in the commit series. > >>> + Generic tables > >>> + > >>> + A unified API for reading tables, either merged tables, or single readers. > >> > >> Are there callers/helpers that don't know whether they want one or the > >> other? > > > > The setup with per-worktree refs means that there are two reftable > > stacks in a .git repo. In order to iterate over the entire ref space, > > you have to merge two merged tables, but the stack itslef merges a set > > of simple reftables. > > I see. That's subtle, so it seems worth documenting for the next > person reading this documentation and wondering why. Done. -- Han-Wen Nienhuys - Google Munich I work 80%. Don't expect answers from me on Fridays. -- Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Geschäftsführer: Paul Manicle, Halimah DeLaine Prado