On Tue, Jun 27, 2023 at 03:54:27AM -0400, Jeff King wrote: > > In other cases, Git prepares its list of packfiles by scanning .idx > > files and then only adds it to the packfile list if the corresponding > > .pack file exists. It even does so without a warning! (See > > add_packed_git() in packfile.c for details.) > > Interesting. I'd have expected a warning. ;) Reading this all over, I was incorrect in my original suggestion that not checking for the existence of the matching ".pack" file was the expected thing to do. We don't try to open the ".pack" file itself in add_packed_git(), that happens later in open_packed_git() if we end up actually needing to read objects from the pack, or want to retain a handle on it for later if we're worried that the ".pack" file itself might get deleted. But that add_packed_git() silently ignores a pack which has its ".idx" file and not its ".pack" file is behavior that I wasn't aware of, and had somehow missed when I reread the function last week. Stolee: I apologize for having missed this important detail. I think in that sense matching the behavior of add_packed_git() here is the right thing to do, and that this patch makes sense to me. > I also kind of wonder if this repack code should simply be loading and > iterating the packed_git list, but that is a much bigger change. I have wanted to do this for a while ;-). The minimal patch is less invasive than I had thought: --- 8< --- diff --git a/builtin/repack.c b/builtin/repack.c index 1e21a21ea8..e854c832fd 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -104,46 +104,33 @@ static void collect_pack_filenames(struct string_list *fname_nonkept_list, struct string_list *fname_kept_list, const struct string_list *extra_keep) { - DIR *dir; - struct dirent *e; - char *fname; + struct packed_git *p; struct strbuf buf = STRBUF_INIT; - if (!(dir = opendir(packdir))) - return; - - while ((e = readdir(dir)) != NULL) { - size_t len; + for (p = get_all_packs(the_repository); p; p = p->next) { int i; - - if (!strip_suffix(e->d_name, ".idx", &len)) - continue; - - strbuf_reset(&buf); - strbuf_add(&buf, e->d_name, len); - strbuf_addstr(&buf, ".pack"); + const char *base = basename(p->pack_name); for (i = 0; i < extra_keep->nr; i++) - if (!fspathcmp(buf.buf, extra_keep->items[i].string)) + if (!fspathcmp(base, extra_keep->items[i].string)) break; - fname = xmemdupz(e->d_name, len); + strbuf_reset(&buf); + strbuf_addstr(&buf, base); + strbuf_strip_suffix(&buf, ".pack"); - if ((extra_keep->nr > 0 && i < extra_keep->nr) || - (file_exists(mkpath("%s/%s.keep", packdir, fname)))) { - string_list_append_nodup(fname_kept_list, fname); - } else { + if ((extra_keep->nr > 0 && i < extra_keep->nr) || p->pack_keep) + string_list_append(fname_kept_list, buf.buf); + else { struct string_list_item *item; - item = string_list_append_nodup(fname_nonkept_list, - fname); - if (file_exists(mkpath("%s/%s.mtimes", packdir, fname))) + item = string_list_append(fname_nonkept_list, buf.buf); + if (p->is_cruft) item->util = (void*)(uintptr_t)CRUFT_PACK; } } - closedir(dir); - strbuf_release(&buf); string_list_sort(fname_kept_list); + strbuf_release(&buf); } static void remove_redundant_pack(const char *dir_name, const char *base_name) --- >8 --- I think you could probably go further than this, since having to store the suffix-less pack names in the fname_kept and fname_nonkept lists is kind of weird. It would be nice if we could store pointers to the actual packed_git structs themselves in place of those lists instead, but I'm not immediately sure how feasible it would be to do since we re-prepare the object store between enumerating and then removing these packs. Thanks, Taylor