Re: Bug report: v2.47.0 cannot fetch version 1 pack indexes

On Mon, Oct 21, 2024 at 04:33:15PM -0400, Taylor Blau wrote:

> > @@ -2388,8 +2389,24 @@ static char *fetch_pack_index(unsigned char *hash, const char *base_url)
> >  	strbuf_addf(&buf, "objects/pack/pack-%s.idx", hash_to_hex(hash));
> >  	url = strbuf_detach(&buf, NULL);
> >
> > -	strbuf_addf(&buf, "%s.temp", sha1_pack_index_name(hash));
> > -	tmp = strbuf_detach(&buf, NULL);
> > +	/*
> > +	 * Don't put this into packs/, since it's just temporary and we don't
> > +	 * want to confuse it with our local .idx files.  We'll generate our
> > +	 * own index if we choose to download the matching packfile.
> > +	 *
> > +	 * It's tempting to use xmks_tempfile() here, but it's important that
> > +	 * the file not exist, otherwise http_get_file() complains. So we
> > +	 * create a filename that should be unique, and then just register it
> > +	 * as a tempfile so that it will get cleaned up on exit.
> > +	 *
> > +	 * Arguably it would be better to hold on to the tempfile handle so
> > +	 * that we can remove it as soon as we download the pack and generate
> > +	 * the real index, but that might need more surgery.
> > +	 */
> > +	tmp = xstrfmt("%s/tmp_pack_%s.idx",
> > +		      repo_get_object_directory(the_repository),
> > +		      hash_to_hex(hash));
> > +	register_tempfile(tmp);
> 
> Makes perfect sense, and the comment above here is much appreciated.
> 
> I thought about trying to use some intermediate state of the strbuf here
> to avoid an extra xstrfmt() call, but couldn't come up with anything I
> didn't think was awkward.

I don't think there's any useful intermediate state. The earlier %s is
the base url, but here it's our local directory.

We could continue to re-use the scratch strbuf as the existing code did
(which is what xstrfmt() does under the hood anyway). I didn't really
intend to change that; it fell out of the many attempts it took to get
here (using mks_tempfile(), and so on).
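
For illustration, the path construction can be sketched outside of git
in plain C. This is a rough standalone sketch, not the patch itself:
tmp_idx_path() is a hypothetical name, it uses snprintf() and malloc()
in place of xstrfmt(), and the object directory and hex hash are just
strings supplied by the caller. It only builds the name and does not
create or register any file.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for xstrfmt("%s/tmp_pack_%s.idx", ...).
 * Returns a malloc'd string the caller must free, or NULL on error.
 * Nothing here touches a real repository or creates a file.
 */
static char *tmp_idx_path(const char *objdir, const char *hex)
{
	/* First pass: measure how many bytes the formatted path needs. */
	int len = snprintf(NULL, 0, "%s/tmp_pack_%s.idx", objdir, hex);
	char *out;

	if (len < 0)
		return NULL;
	out = malloc((size_t)len + 1);
	if (!out)
		return NULL;
	/* Second pass: format into the exactly-sized buffer. */
	snprintf(out, (size_t)len + 1, "%s/tmp_pack_%s.idx", objdir, hex);
	return out;
}
```

The name is derived from the pack hash rather than generated randomly,
which matches the comment in the patch: the file must not exist before
the download starts, so only a name is reserved.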

> > +static char *pack_path_from_idx(const char *idx_path)
> > +{
> > +	size_t len;
> > +	if (!strip_suffix(idx_path, ".idx", &len))
> > +		BUG("idx path does not end in .idx: %s", idx_path);
> > +	return xstrfmt("%.*s.pack", (int)len, idx_path);
> > +}
> > +
> >  struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
> >  {
> > -	const char *path = sha1_pack_name(sha1);
> > +	char *path = pack_path_from_idx(idx_path);
> 
> Huh. I would have thought we have such a helper function already. I
> guess we probably do, but that it's also defined statically because it's
> so easy to write.

I thought so, too, but couldn't find one. We have pack_bitmap_filename()
(and so on for .rev and .midx files) that goes from .pack to those
extensions. But here we want to go from .idx to .pack. I think most
stuff goes from ".pack" because that's what we store in the packed_git
struct.

There's also sha1_pack_index_name(), but that goes from a csum-file hash
to a filename.

I grepped around and strip_suffix() seems to be par for the course in
similar situations within pack/repack code, so I think it's OK here.
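
To make the .idx-to-.pack direction concrete, here is a minimal
standalone sketch of the same idea without any git internals. Both
function names are hypothetical: strip_suffix_sketch() only imitates
the behavior of git's strip_suffix() as used in the patch, and the
error path returns NULL where the patch calls BUG().

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical imitation of strip_suffix(): returns 1 and stores the
 * length of str without the suffix in *len, or returns 0 if str does
 * not end with suffix.
 */
static int strip_suffix_sketch(const char *str, const char *suffix, size_t *len)
{
	size_t slen = strlen(str), xlen = strlen(suffix);

	if (slen < xlen || memcmp(str + slen - xlen, suffix, xlen))
		return 0;
	*len = slen - xlen;
	return 1;
}

/* Returns a malloc'd "<base>.pack" for "<base>.idx", or NULL otherwise. */
static char *pack_path_from_idx_sketch(const char *idx_path)
{
	size_t len;
	char *out;

	if (!strip_suffix_sketch(idx_path, ".idx", &len))
		return NULL; /* the patch treats this as a BUG() instead */
	/* sizeof(".pack") counts the trailing NUL, so this is exact. */
	out = malloc(len + sizeof(".pack"));
	if (!out)
		return NULL;
	memcpy(out, idx_path, len);
	memcpy(out + len, ".pack", sizeof(".pack"));
	return out;
}
```

Because the result is derived from whatever path the caller passes in,
it works for an index stored outside of objects/pack/, which is the
case the temporary tmp_pack_*.idx file creates.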

> In any case, this looks like the right thing to do here. It would be
> nice to have a corresponding test here, since unlike the other
> finalize_object_file() changes, this one can be provoked
> deterministically.
> 
> Would you mind submitting this as a bona-fide patch, which I can then
> pick up and start merging down?

Yeah, the test is easy:

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 58189c9f7d..50a7b98813 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -507,4 +507,14 @@ test_expect_success 'fetching via http alternates works' '
 	git -c http.followredirects=true clone "$HTTPD_URL/dumb/alt-child.git"
 '
 
+test_expect_success 'dumb http can fetch index v1' '
+	server=$HTTPD_DOCUMENT_ROOT_PATH/idx-v1.git &&
+	git init --bare "$server" &&
+	git -C "$server" --work-tree=. commit --allow-empty -m foo &&
+	git -C "$server" -c pack.indexVersion=1 gc &&
+
+	git clone "$HTTPD_URL/dumb/idx-v1.git" &&
+	git -C idx-v1 fsck
+'
+
 test_done

I raised some other more philosophical issues in the other part of the
thread, but assuming the answer is "no, let's do the simplest thing",
then I think this approach is OK.

I'd also like to see if I can clean things up around parse_pack_index(),
whose semantics I'm changing here (and which violates all manner of
assumptions that we usually have about packed_git structs). It's used
only by the dumb-http code, and I think we want to refactor it a bit so
that nobody else is tempted to use it.

I'll try to send something out tonight or tomorrow.

-Peff



