On Thu, Mar 11, 2021 at 12:05:29PM -0500, Taylor Blau wrote: > Implement reading for multi-pack reverse indexes, as described in the > previous patch. Looks good overall. I found a few tiny nits below. > +int load_midx_revindex(struct multi_pack_index *m) > +{ > + char *revindex_name; > + int ret; > + if (m->revindex_data) > + return 0; > + > + revindex_name = get_midx_rev_filename(m); > + > + ret = load_revindex_from_disk(revindex_name, > + m->num_objects, > + &m->revindex_map, > + &m->revindex_len); > + if (ret) > + goto cleanup; On error, I wondered if m->revindex_map, etc, would be modified. But it looks like no, load_revindex_from_disk() is careful not to touch them unless it sees a valid revindex. Good. > +int close_midx_revindex(struct multi_pack_index *m) > +{ > + if (!m) > + return 0; > + > + if (munmap((void*)m->revindex_map, m->revindex_len)) > + return -1; > + > + m->revindex_map = NULL; > + m->revindex_data = NULL; > + m->revindex_len = 0; > + > + return 0; > +} It's hard to imagine why munmap() would fail. But if it does, we should probably clear the struct fields anyway. I note that the matching code for a "struct packed_git" does not bother even checking the return value of munmap. Perhaps we should just do the same here. The packed_git version also returned early if revindex_map is NULL. Here the burden is placed on the caller (it's hard to tell if that matters since there aren't any callers yet, but it probably makes sense to push the check down into this function). > +uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos) > +{ > + if (!m->revindex_data) > + BUG("pack_pos_to_midx: reverse index not yet loaded"); > + if (m->num_objects <= pos) > + BUG("pack_pos_to_midx: out-of-bounds object at %"PRIu32, pos); > + return get_be32((const char *)m->revindex_data + (pos * sizeof(uint32_t))); > +} OK, this one is just a direct read of the .rev data, like pack_pos_to_index() is. I think the final line can be simplified to: return get_be32(m->revindex_data + pos); just like pack_pos_to_index(). (I suspect this is a leftover from the earlier version of your .rev series where the pointer was still a "void *"). > +int midx_to_pack_pos(struct multi_pack_index *m, uint32_t at, uint32_t *pos) > +{ > + struct midx_pack_key key; > + uint32_t *found; > + > + if (!m->revindex_data) > + BUG("midx_to_pack_pos: reverse index not yet loaded"); > + if (m->num_objects <= at) > + BUG("midx_to_pack_pos: out-of-bounds object at %"PRIu32, at); > + > + key.pack = nth_midxed_pack_int_id(m, at); > + key.offset = nth_midxed_offset(m, at); > + key.midx = m; > + /* > + * The preferred pack sorts first, so determine its identifier by > + * looking at the first object in pseudo-pack order. > + * > + * Note that if no --preferred-pack is explicitly given when writing a > + * multi-pack index, then whichever pack has the lowest identifier > + * implicitly is preferred (and includes all its objects, since ties are > + * broken first by pack identifier). > + */ > + key.preferred_pack = nth_midxed_pack_int_id(m, pack_pos_to_midx(m, 0)); > + > + found = bsearch(&key, m->revindex_data, m->num_objects, > + sizeof(uint32_t), midx_pack_order_cmp); OK, this one is _roughly_ equivalent to offset_to_pack_pos(), in that we have to binary search within the pack-ordered list to find the entry. Makes sense. Probably sizeof(*m->revindex_data) would be slightly nicer in the bsearch call (again, I suspect a holdover from when that was a void pointer). -Peff