On Fri, Jan 08, 2021 at 01:16:43PM -0500, Taylor Blau wrote: > In the next several patches, we will prepare for loading a reverse index > either in memory, or from a yet-to-be-introduced on-disk format. To do > that, we'll introduce an API that avoids the caller explicitly indexing > the revindex pointer in the packed_git structure. This API looks good to me. Here are a few extra thoughts: > There are four ways to interact with the reverse index. Accordingly, > four functions will be exported from 'pack-revindex.h' by the time that > the existing API is removed. A caller may: This tells us what the new API functions do. That's useful, but should it be in the header file itself, documenting each function? Likewise, I think we'd want to define the concepts in that documentation. Something like: /* * A revindex allows converting efficiently between three properties * of an object within a pack: * * - index position: the numeric position within the list of * sorted object ids found in the .idx file * * - pack position: the numeric position within the list of objects * in their order within the actual .pack file (i.e., 0 is the * first object in the .pack, 1 is the second, and so on) * * - offset: the byte offset within the .pack file at which the * object contents can be found */ And then above each function we can just say that it converts X to Y (like you have in the commit message). It may also be worth indicating the run-time of each (some of them are constant-time once you have a revindex, and some are log(n)). > +int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) The types here make sense. off_t is clearly needed for a pack offset, and uint32_t is correct for the position fields, because packs have a 4-byte object count. Separating the error return from the out-parameter makes the interface slightly more awkward, but is needed to use the properly-sized types. Makes sense. > +int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) > +{ > + int ret; > + > + if (load_pack_revindex(p) < 0) > + return -1; This one lazy-loads the revindex for us, which seems handy... > +uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos) > +{ > + if (!p->revindex) > + BUG("pack_pos_to_index: reverse index not yet loaded"); > + if (pos >= p->num_objects) > + BUG("pack_pos_to_index: out-of-bounds object at %"PRIu32, pos); > + return p->revindex[pos].nr; > +} But these ones don't. I'm glad we at least catch it with a BUG(), but it makes the API a little funny. Returning an error here would require a similarly awkward out-parameter, I guess. -Peff