Re: [PATCH 03/11] dumb-http: store downloaded pack idx as tempfile

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 25, 2024 at 05:18:51PM -0400, Taylor Blau wrote:

> > -	strbuf_addf(&buf, "%s.temp", sha1_pack_index_name(hash));
> > -	tmp = strbuf_detach(&buf, NULL);
> > +	/*
> > +	 * Don't put this into packs/, since it's just temporary and we don't
> > +	 * want to confuse it with our local .idx files.  We'll generate our
> > +	 * own index if we choose to download the matching packfile.
> 
> I actually think putting it in $GIT_DIR/objects/pack is fine and perhaps
> preferable, since that's where we put existing in-progress temporary
> files. We'll ignore anything that doesn't end with "*.idx", so I think
> it would be fine.

But the new tempfile _does_ end with .idx. And it (kinda) has to,
because that's an assumption made by packed_git: it stores
path/to/foo.pack, and then assumes it can access /path/to/foo.idx as
needed.

I say "kinda" because as I mentioned elsewhere, the prior code cheats a
bit and assumes that we can access the idx file through the original
mmap we made, even if the ".idx" the packed_git mentions does not yet
exist. I _think_ that's pretty foolproof and that we'd never unmap or
close an idx (though we certainly do for packs themselves). But it
doesn't feel like a great thing to rely on.

> I don't feel strongly about it. It just feels a little weird to stick
> these temporary files in one place, and stick other (similar) temporary
> flies in another.

One difference is that these are pure tempfiles. We'll never "finalize"
them by renaming them into place, so there's no chance of having to do a
cross-directory link/rename. They just get created and then deleted[1].

Perhaps something like "objects/temp-remote-index/*.idx" would be a
more descriptive name (and certainly what I'd use if making a more full
caching system), but there are complications:

  1. We don't have good cleanup routines for directories. Would that
     just become a semi-permanent empty directory in a dumb-http repo?

  2. git-prune cleans up tmp_* in objects/ and objects/pack, but would
     have to learn to look there, too.

So just sticking them in objects/ seemed like the path of least
resistance to me.

-Peff

[1] They really could be opened anywhere, like /tmp. Which I was tempted
    to do, but since they're non-trivial in size it seemed like keeping
    them inside the repo directory was the best bet.

    Curiously, if we lean into the "we will never close a mapped idx",
    then we could create them, open and map them, and then delete them
    immediately, which is a somewhat common tempfile idiom. At least on
    Unix systems. I'm not sure how Windows would like that. :)




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux