Re: "bad revision" fetch error when fetching missing objects from partial clones

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[+cc Jonathan Tan for partial clone wisdom]

On Thu, May 13, 2021 at 06:53:36AM -0400, Jeff King wrote:

> > So I think it will require some digging. My _guess_ is that it has to do
> > with the "never filter out an object that was explicitly requested" rule
> > not being consistently followed. But we'll see.
> 
> OK, I think I've unraveled the mysteries.

Nope. This part is wrong:

> It is indeed a problem with the "never filter out an object that was
> explicitly requested" rule. But really, that rule is stronger: it is
> "always include an explicitly requested object". I.e., it must override
> even the usual "you said commit X was uninteresting, so we did not
> include objects reachable from X" logic.

The rule really is the softer "don't filter out explicitly-requested
objects". It's just that the non-bitmap traversal code is way less
careful about finding uninteresting objects.

Here's the same scenario failing without using bitmaps at all:

  # same as before; repo with one commit
  git init repo
  cd repo
  
  echo content >file
  git add file
  git commit -m base

  # and now we make a partial clone omitting the blob
  git config uploadpack.allowfilter true
  git clone --no-local --bare --filter=blob:none . clone

  # but here's the twist. Now we make a second commit...
  echo more >file
  git commit -am more

  # ...and then we fetch both the object we need _and_ that second
  # commit. That causes pack-objects to traverse from base..more.
  # The boundary is at "base", so we mark its tree and blob as
  # UNINTERESTING, and thus we _don't_ send them.
  cd clone
  git fetch origin $(git rev-parse HEAD:file) HEAD

So I guess the first question is: is this supposed to work? Without
bitmaps, it often will. Because we walk commits first, and then only
mark trees uninteresting at the boundary; so if there were more commits
here, and we were asking to get a blob from one of the middle ones, it
would probably work. But fundamentally the client is lying to the server
here (as all partial clones must); it is saying "I have that first
commit", but of course we don't have all of the reachable objects.

If this is supposed to work, I think we need to teach the traversal code
to "add back" all of the objects that were explicitly given when a
filter is in use (either explicitly, or perhaps just clearing or
avoiding the UNINTERESTING flag on user-given objects in the first
place). And my earlier patch does that for the bitmap side, but not the
regular traversal.

Alternatively, part of that fundamental "partial clones are lying about
their reachability" property is that we assume they can fetch the
objects they need on-demand. And indeed, the on-demand fetch we'd do
will use the noop negotiation algorithm, and will succeed. So should we,
after receiving an empty pack from the other side (because we claimed to
have objects reachable from the commit), then do a follow-up on-demand
fetch? If so, then why isn't that kicking in (I can guess there might be
some limits to on-demand fetching during a fetch itself, in order to
avoid infinite recursion)?

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux