Taylor Blau <me@xxxxxxxxxxxx> writes:

> An alternative approach to closing this race would have been to call
> `is_pack_valid()` on _all_ packs in a multi-pack bitmap on load. This
> has a couple of problems:
>
>   - it is unnecessarily expensive in the cases where we don't actually
>     need to open any packs (e.g., in `git rev-list --use-bitmap-index
>     --count`)
>
>   - more importantly, it means any time we would have hit this race,
>     we'll avoid using bitmaps altogether, leading to significant
>     slowdowns by forcing a full object traversal

This answers a question I had about why we're only opening the
preferred pack instead of all packs. (You mention in [1] that it's
also answered in that patch's message, but I didn't see it there.) In
any case, it might be clearer to move this part to the first commit.

[1] https://lore.kernel.org/git/Yn63nDhSBIEa3%2F+%2F@nand.local/

> Work around this by calling `is_pack_valid()` from within
> `want_found_object()`, matching the behavior in
> `want_object_in_pack_one()` (which has an analogous call). Most calls to
> `is_pack_valid()` should be basically no-ops, since only the first call
> requires us to open a file (subsequent calls realize the file is already
> open, and return immediately).
>
> This does require us to make a small change in
> `want_object_in_pack_one()`, since `want_found_object()` may return `-1`
> (indicating "keep searching for other packs containing this object")
> when `*found_pack` is non-NULL. Force `want_object_in_pack_one()` to
> call `is_pack_valid()` when `p != *found_pack`, not just when
> `*found_pack` is non-NULL.

It took me a while to realize that the relevant want_found_object()
invocation that may return -1 is not the one in
want_object_in_pack_one(), but the one in the latter's caller,
want_object_in_pack(). But even in that case, couldn't
want_found_object() return -1 (see the very end of the function) even
before this patch?

> @@ -1424,14 +1427,15 @@ static int want_object_in_pack_one(struct packed_git *p,
>                                     off_t *found_offset)
>  {
>          off_t offset;
> +        int use_found = p == *found_pack;
>
> -        if (p == *found_pack)
> +        if (use_found)
>                  offset = *found_offset;
>          else
>                  offset = find_pack_entry_one(oid->hash, p);
>
>          if (offset) {
> -                if (!*found_pack) {
> +                if (!use_found) {
>                          if (!is_pack_valid(p))
>                                  return -1;
>                          *found_offset = offset;

My understanding of the purpose of this code change is that if we
reach here with a non-NULL *found_pack, it is likely that *found_pack
points to an invalid pack, and this part overwrites *found_pack (and
*found_offset) if it finds a valid pack. That seems like a good
change, but I don't see how it is a result of something that "does
require us" to change (as far as I can tell, *found_pack could already
have been invalid before this patch, so the downstream code should
already be able to handle that). Maybe it's just that previously we
couldn't tell whether the pack is invalid, but now we can; if so, it
would be better to say something like "use this added information to
overwrite *found_pack when it makes sense".

(An alternative to the change in this patch may be to reset
*found_pack to NULL when the pack is found to be invalid, but I
haven't investigated all the callers to see whether they can tolerate
*found_pack changing from non-NULL to NULL, so the change in this
patch is probably more practical. A rough sketch of what I mean is at
the end of this message.)

Other than that (and the "close(fd)" thing in patch 1 that Junio
mentioned [2]), this series looks good to me. Thanks for noticing and
fixing the bug.

[2] https://lore.kernel.org/git/xmqqpmkh9tye.fsf@gitster.g/
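
P.S. To illustrate the alternative I mentioned above: a rough,
untested sketch of what want_object_in_pack_one() could do near its
top, written against the hunk quoted earlier. It is purely an
illustration, assumes the callers can cope with *found_pack being
cleared, and I haven't checked how it interacts with the use_found
logic this patch introduces:

        /*
         * Untested sketch: if the pack we matched earlier turns out
         * to be invalid, forget that match entirely instead of
         * quietly overwriting it later, so the rest of the function
         * (and the caller's loop over packs) starts from a clean
         * slate.
         */
        if (*found_pack && !is_pack_valid(*found_pack)) {
                *found_pack = NULL;
                *found_offset = 0;
        }

        if (p == *found_pack)
                offset = *found_offset;
        else
                offset = find_pack_entry_one(oid->hash, p);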