On Thu, Aug 01, 2024 at 06:06:35AM -0400, Jeff King wrote: > One thing that confused me, though, is setting "num". From the "-w" > diff: > > - num = m->num_objects; > + > + num = m->num_objects + m->num_objects_in_base; > > if (!num) > - return; > + continue; > > Before we only had one midx, so that was our limit. But now we are > looking at "num" as a limit in the global size of the chained midx. > Which feels weird, since we're just considering a single layer here. We > seem to use "num" in two ways: > > - we return if it's 0 (or now continue to the next layer). But > wouldn't we want to do that per-layer? I don't think it will produce > wrong answers, but we're less likely to kick in this early return > (though it's not clear to me when it would ever kick in really; a > zero-length midx?). This is definitely a bug. We should certainly do something like: for (; m; m = m->base_midx) { uint32_t num; if (!m->num_objects) continue; num = m->num_objects + m->num_objects_in_base; /* ... */ } I'll go ahead and fix this one up locally, which is easy enough to do. > So I think it's correct, though it feels like bsearch_one_midx() > should still return the position within that midx (and then > bsearch_midx() could add back m->num_objects_in_base to get a global > position). And then I guess likewise there would need to be a > midx-local version of nth_midxed_object_oid(). Like many of the other changes in this series, it's really a matter of where you put the complexity: either it's in the callers or in the function itself. I think here I prefer having bsearch_one_midx() return the global position, since it is directly usable in other top-level functions within the MIDX API, like being able to pass it directly to nth_midxed_object_oid() and etc. below. Thanks, Taylor