On Mon, Oct 21, 2024 at 04:33:15PM -0400, Taylor Blau wrote:

> > @@ -2388,8 +2389,24 @@ static char *fetch_pack_index(unsigned char *hash, const char *base_url)
> >  	strbuf_addf(&buf, "objects/pack/pack-%s.idx", hash_to_hex(hash));
> >  	url = strbuf_detach(&buf, NULL);
> >
> > -	strbuf_addf(&buf, "%s.temp", sha1_pack_index_name(hash));
> > -	tmp = strbuf_detach(&buf, NULL);
> > +	/*
> > +	 * Don't put this into packs/, since it's just temporary and we don't
> > +	 * want to confuse it with our local .idx files. We'll generate our
> > +	 * own index if we choose to download the matching packfile.
> > +	 *
> > +	 * It's tempting to use xmks_tempfile() here, but it's important that
> > +	 * the file not exist, otherwise http_get_file() complains. So we
> > +	 * create a filename that should be unique, and then just register it
> > +	 * as a tempfile so that it will get cleaned up on exit.
> > +	 *
> > +	 * Arguably it would be better to hold on to the tempfile handle so
> > +	 * that we can remove it as soon as we download the pack and generate
> > +	 * the real index, but that might need more surgery.
> > +	 */
> > +	tmp = xstrfmt("%s/tmp_pack_%s.idx",
> > +		      repo_get_object_directory(the_repository),
> > +		      hash_to_hex(hash));
> > +	register_tempfile(tmp);
>
> Makes perfect sense, and the comment above here is much appreciated.
>
> I thought about trying to use some intermediate state of the strbuf here
> to avoid an extra xstrfmt() call, but couldn't come up with anything I
> didn't think was awkward.

I don't think there's any useful intermediate state. The earlier %s is
the base url, but here it's our local directory.

We could continue to re-use the scratch strbuf as the existing code did
(and which xstrfmt() is doing under the hood). It wasn't really
intentional for me to change that, but I went through a lot of attempts
to get here (using mks_tempfile(), and so on).

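To make the "build a name, then register it" idea in that comment concrete
outside of git, here is a minimal libc-only sketch. It is not git code:
make_tmp_idx_path(), register_for_cleanup() and the single-slot cleanup_path
are hypothetical stand-ins for xstrfmt() and register_tempfile(), and it
assumes only one path needs tracking at a time.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a tempfile list: a single path removed at exit. */
static char *cleanup_path;

static void remove_cleanup_path(void)
{
	/* Ignore errors: the download may never have created the file. */
	if (cleanup_path)
		remove(cleanup_path);
}

/* Hypothetical analogue of register_tempfile(): remember path, remove at exit. */
static void register_for_cleanup(char *path)
{
	static int installed;

	cleanup_path = path;
	if (!installed) {
		atexit(remove_cleanup_path);
		installed = 1;
	}
}

/*
 * Build "<objdir>/tmp_pack_<hex>.idx" WITHOUT creating the file, since the
 * downloader in this scenario expects the destination not to exist yet.
 * Returns a malloc'd string, or NULL on failure.
 */
static char *make_tmp_idx_path(const char *objdir, const char *hex)
{
	int len;
	char *path;

	assert(objdir && hex);
	len = snprintf(NULL, 0, "%s/tmp_pack_%s.idx", objdir, hex);
	if (len < 0)
		return NULL;
	path = malloc((size_t)len + 1);
	if (!path)
		return NULL;
	snprintf(path, (size_t)len + 1, "%s/tmp_pack_%s.idx", objdir, hex);
	return path;
}
```

The point of the ordering is that nothing touches the filesystem until the
download itself writes the file; registration only guarantees that whatever
ends up at that path is removed when the process exits.
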
> > +static char *pack_path_from_idx(const char *idx_path)
> > +{
> > +	size_t len;
> > +	if (!strip_suffix(idx_path, ".idx", &len))
> > +		BUG("idx path does not end in .idx: %s", idx_path);
> > +	return xstrfmt("%.*s.pack", (int)len, idx_path);
> > +}
> > +
> >  struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
> >  {
> > -	const char *path = sha1_pack_name(sha1);
> > +	char *path = pack_path_from_idx(idx_path);
>
> Huh. I would have thought we have such a helper function already. I
> guess we probably do, but that it's also defined statically because it's
> so easy to write.

I thought so, too, but couldn't find one. We have pack_bitmap_filename()
(and so on for .rev and .midx files) that goes from .pack to those
extensions. But here we want to go from .idx to .pack. I think most
stuff goes from ".pack" because that's what we store in the packed_git
struct.

There's also sha1_pack_index_name(), but that goes from a csum-file hash
to a filename.

I grepped around and strip_suffix() seems to be par for the course in
similar situations within pack/repack code, so I think it's OK here.

> In any case, this looks like the right thing to do here. It would be
> nice to have a corresponding test here, since unlike the other
> finalize_object_file() changes, this one can be provoked
> deterministically.
>
> Would you mind submitting this as a bona-fide patch, which I can then
> pick up and start merging down?

Yeah, the test is easy:

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 58189c9f7d..50a7b98813 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -507,4 +507,14 @@ test_expect_success 'fetching via http alternates works' '
 	git -c http.followredirects=true clone "$HTTPD_URL/dumb/alt-child.git"
 '
 
+test_expect_success 'dumb http can fetch index v1' '
+	server=$HTTPD_DOCUMENT_ROOT_PATH/idx-v1.git &&
+	git init --bare "$server" &&
+	git -C "$server" --work-tree=. commit --allow-empty -m foo &&
+	git -C "$server" -c pack.indexVersion=1 gc &&
+
+	git clone "$HTTPD_URL/dumb/idx-v1.git" &&
+	git -C idx-v1 fsck
+'
+
 test_done

I raised some other more philosophical issues in the other part of the
thread, but assuming the answer is "no, let's do the simplest thing",
then I think this approach is OK.

I'd also like to see if I can clean things up around parse_pack_index(),
whose semantics I'm changing here (and which violates all manner of
assumptions that we usually have about packed_git structs). It's used
only by the dumb-http code, and I think we want to refactor it a bit so
that nobody else is tempted to use it.

I'll try to send something out tonight or tomorrow.

-Peff
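
For anyone who wants to poke at the .idx-to-.pack conversion outside of the
git tree, the following is a rough standalone sketch. It is not the patch
itself: strip_suffix_sketch() is a simplified stand-in for git's
strip_suffix(), pack_path_from_idx_sketch() is a hypothetical name, and it
returns NULL on a bad suffix where the real helper calls BUG().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Simplified stand-in for strip_suffix(): returns 1 and stores the length
 * of str minus the suffix in *len, or returns 0 if str lacks the suffix.
 */
static int strip_suffix_sketch(const char *str, const char *suffix, size_t *len)
{
	size_t slen = strlen(str), xlen = strlen(suffix);

	if (slen < xlen || memcmp(str + slen - xlen, suffix, xlen))
		return 0;
	*len = slen - xlen;
	return 1;
}

/*
 * Returns a malloc'd "<base>.pack" for an input of "<base>.idx".
 * Returns NULL if the input does not end in ".idx" or allocation fails.
 */
static char *pack_path_from_idx_sketch(const char *idx_path)
{
	size_t len;
	char *out;

	assert(idx_path);
	if (!strip_suffix_sketch(idx_path, ".idx", &len))
		return NULL;
	out = malloc(len + sizeof(".pack")); /* sizeof includes the NUL */
	if (!out)
		return NULL;
	memcpy(out, idx_path, len);
	memcpy(out + len, ".pack", sizeof(".pack")); /* copies the NUL too */
	return out;
}
```

Deriving the pack name from the idx path (rather than from the hash) means
the result follows wherever the idx file actually lives, which is what the
temporary-location change above relies on.
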