Re: partial clone: promisor fetch during push (pack-objects)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> For a partially-cloned repository, push (pack-objects) does a fetch
> for missing objects to do delta compression with.
> 
> Test-case to reproduce, with annotated output:
> https://gist.github.com/rcoup/5593a0627cca52f226580c72743d8e33
> (git v2.30.1.602.g966e671106)

I found it difficult to understand the gist (there were what I think
are unrelated packings and unpackagings of objects, for example), but
what I gleaned is that you did a bare partial clone (thus, no checkout,
so no blobs get fetched at all), wrote a new blob to the repo, manually
created a tree with the ID of that blob and the IDs of some missing
blobs taken from the current HEAD tree, committed, and then pushed that
commit.

> My use case is partially-cloning some large repositories with the
> majority of the blobs not fetched locally, writing some blobs to
> replace some specific paths I don't currently have cloned, committing
> & pushing (while leaving most trees as-is). I didn't expect git to
> attempt promisor fetches during a push :)
> 
> I happened to see it because allowAnySha1InWant wasn't enabled, so it
> printed errors but happily continued and completed the push (after a
> fetch attempt for each object).
> 
> My current workaround is setting `pack.window=0` during push. Looks
> from builtin/pack_objects.c:prepare_pack() that `pack.depth=0` should
> skip it too, but that didn't seem to work.

Ah, thanks for this. So the cause is not that we are unnecessarily
pushing the missing objects, but because they appear in the window when
we do delta compression (just like you said at the start).

> Invoking pack-objects directly with any --missing= value still tries
> to fetch. And regardless, it carries on if the fetches fail. The
> fetches happen at the end of builtin/pack_objects.c:check_object().

To clarify, the "end" here is the call to prefetch_to_pack(), which
indeed bypasses fetch_if_missing (set by some --missing= arguments).

> 1. Feels to me that fetching from a promisor remote is never going to
> be quicker than skipping delta compression during a push. Maybe
> there's a case for doing it during other pack invocations to minimise
> size though?

It makes sense to me that we shouldn't fetch in this situation.

> 2. Seems like a bug that check_object() doesn't honour
> fetch_if_missing and skip the call to prefetch_to_pack().

We do need to fetch if we're trying to pack a missing object. I think
the real bug here is that we shouldn't prefetch if the to_pack entry has
preferred_base set (which means that it will not go into the final
packfile and is just available for delta compression).

> 3. But push doesn't pass --missing= to pack-objects anyway, so that
> wouldn't actually solve the original issue. Should it?

If we fix the bug I described above, I think it's still OK if push
doesn't pass --missing=.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux