Re: [PATCH] repack: only repack .packs that exist

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 27, 2023 at 03:54:27AM -0400, Jeff King wrote:
> > In other cases, Git prepares its list of packfiles by scanning .idx
> > files and then only adds it to the packfile list if the corresponding
> > .pack file exists. It even does so without a warning! (See
> > add_packed_git() in packfile.c for details.)
>
> Interesting. I'd have expected a warning. ;)

Reading this all over, I was incorrect in my original suggestion that
not checking for the existence of the matching ".pack" file was the
expected thing to do.

We don't try to open the ".pack" file itself in add_packed_git(), that
happens later in open_packed_git() if we end up actually needing to read
objects from the pack, or want to retain a handle on it for later if
we're worried that the ".pack" file itself might get deleted.

But that add_packed_git() silently ignores a pack which has its ".idx"
file and not its ".pack" file is behavior that I wasn't aware of, and
had somehow missed when I reread the function last week.

Stolee: I apologize for having missed this important detail. I think in
that sense matching the behavior of add_packed_git() here is the right
thing to do, and that this patch makes sense to me.

> I also kind of wonder if this repack code should simply be loading and
> iterating the packed_git list, but that is a much bigger change.

I have wanted to do this for a while ;-). The minimal patch is less
invasive than I had thought:

--- 8< ---
diff --git a/builtin/repack.c b/builtin/repack.c
index 1e21a21ea8..e854c832fd 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -104,46 +104,33 @@ static void collect_pack_filenames(struct string_list *fname_nonkept_list,
 				   struct string_list *fname_kept_list,
 				   const struct string_list *extra_keep)
 {
-	DIR *dir;
-	struct dirent *e;
-	char *fname;
+	struct packed_git *p;
 	struct strbuf buf = STRBUF_INIT;

-	if (!(dir = opendir(packdir)))
-		return;
-
-	while ((e = readdir(dir)) != NULL) {
-		size_t len;
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		int i;
-
-		if (!strip_suffix(e->d_name, ".idx", &len))
-			continue;
-
-		strbuf_reset(&buf);
-		strbuf_add(&buf, e->d_name, len);
-		strbuf_addstr(&buf, ".pack");
+		const char *base = basename(p->pack_name);

 		for (i = 0; i < extra_keep->nr; i++)
-			if (!fspathcmp(buf.buf, extra_keep->items[i].string))
+			if (!fspathcmp(base, extra_keep->items[i].string))
 				break;

-		fname = xmemdupz(e->d_name, len);
+		strbuf_reset(&buf);
+		strbuf_addstr(&buf, base);
+		strbuf_strip_suffix(&buf, ".pack");

-		if ((extra_keep->nr > 0 && i < extra_keep->nr) ||
-		    (file_exists(mkpath("%s/%s.keep", packdir, fname)))) {
-			string_list_append_nodup(fname_kept_list, fname);
-		} else {
+		if ((extra_keep->nr > 0 && i < extra_keep->nr) || p->pack_keep)
+			string_list_append(fname_kept_list, buf.buf);
+		else {
 			struct string_list_item *item;
-			item = string_list_append_nodup(fname_nonkept_list,
-							fname);
-			if (file_exists(mkpath("%s/%s.mtimes", packdir, fname)))
+			item = string_list_append(fname_nonkept_list, buf.buf);
+			if (p->is_cruft)
 				item->util = (void*)(uintptr_t)CRUFT_PACK;
 		}
 	}
-	closedir(dir);
-	strbuf_release(&buf);

 	string_list_sort(fname_kept_list);
+	strbuf_release(&buf);
 }

 static void remove_redundant_pack(const char *dir_name, const char *base_name)
--- >8 ---

I think you could probably go further than this, since having to store
the suffix-less pack names in the fname_kept and fname_nonkept lists is
kind of weird.

It would be nice if we could store pointers to the actual packed_git
structs themselves in place of those lists instead, but I'm not
immediately sure how feasible it would be to do since we re-prepare the
object store between enumerating and then removing these packs.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux