Re: "bad revision" fetch error when fetching missing objects from partial clones

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[+cc Patrick, who also worked in this area recently and is probably
interested to hear about this bug]

On Thu, May 13, 2021 at 05:57:57AM -0400, Jeff King wrote:

> I think it is actually a bug with pack-objects not sending the object,
> but it only seems to trigger with bitmaps. This works:
> 
>   git init repo
>   cd repo
> 
>   echo content >file
>   git add file
>   git commit -m base
> 
>   git config uploadpack.allowfilter true
>   git clone --no-local --bare --filter=blob:none . clone
> 
>   cd clone
>   git fetch origin $(git rev-parse HEAD:file)
> 
> But if I add a "git repack -adb" in the parent repository before the "cd
> clone", then I get:
> 
>   remote: Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
>   fatal: bad revision 'd95f3ad14dee633a758d2e331151e950dd13e4ed'
>   error: /home/peff/tmp/repo/. did not send all necessary objects
> 
> So presumably this is a bug in the bitmap-aware filtering code that is
> not present in the regular filter-traversing code. But what really
> puzzles me is that the result seems totally broken. Yet the test
> coverage in t5310 passes, and nobody has noticed on major sites like
> GitHub (which supports partial clones and most certainly has bitmaps
> enabled).
> 
> So I think it will require some digging. My _guess_ is that it has to do
> with the "never filter out an object that was explicitly requested" rule
> not being consistently followed. But we'll see.

OK, I think I've unraveled the mysteries.

It is indeed a problem with the "never filter out an object that was
explicitly requested" rule. But really, that rule is stronger: it is
"always include an explicitly requested object". I.e., it must override
even the usual "you said commit X was uninteresting, so we did not
include objects reachable from X" logic.

And that is what was happening here. In that final fetch, the client
says: "I have the commit pointed to by HEAD, but I need the blob from
HEAD:file". So the server's bitmap traversal, even before we hit the
blob:none filtering, has cleared the bit for HEAD:file's object! So
there is nothing to filter, but the filtering code must _add it back in_
to the result.

And indeed, if we are adding it back in anyway, we do not need to
exclude it when filtering in the first place. So I think the fix looks
something like this:

diff --git a/pack-bitmap.c b/pack-bitmap.c
index d90e1d9d8c..8458eedbdb 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -797,8 +797,6 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
 	for (i = 0, init_type_iterator(&it, bitmap_git, type);
 	     i < to_filter->word_alloc && ewah_iterator_next(&mask, &it);
 	     i++) {
-		if (i < tips->word_alloc)
-			mask &= ~tips->words[i];
 		to_filter->words[i] &= ~mask;
 	}
 
@@ -810,11 +808,18 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
 	for (i = 0; i < eindex->count; i++) {
 		uint32_t pos = i + bitmap_git->pack->num_objects;
 		if (eindex->objects[i]->type == type &&
-		    bitmap_get(to_filter, pos) &&
-		    !bitmap_get(tips, pos))
+		    bitmap_get(to_filter, pos))
 			bitmap_unset(to_filter, pos);
 	}
 
+	/*
+	 * Add back in any bits that were explicitly asked for (we may have
+	 * cleared them as part of our filtering just now, or they may
+	 * have been dropped by because they were reachable from the "have"
+	 * side of the bitmap).
+	 */
+	bitmap_or(to_filter, tips);
+
 	bitmap_free(tips);
 }
 
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index b02838750e..d34f086228 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -214,6 +214,16 @@ test_expect_success 'partial clone from bitmapped repository' '
 	)
 '
 
+test_expect_success 'follow-up fetch from bitmapped repository' '
+	test_config uploadpack.allowfilter true &&
+	(
+		cd partial-clone.git &&
+		blob=$(git rev-parse HEAD:file-1.t) &&
+		git fetch origin "$blob" &&
+		git cat-file -t "$blob"
+	)
+'
+
 test_expect_success 'setup further non-bitmapped commits' '
 	test_commit_bulk --id=further 10
 '

That fixes the example I showed above, as well as the included test in
t5310. Some follow-on thoughts:

  - this covers blob:none, and indeed any type-exclusion filter, as they
    share this code. But we probably need a similar fix for other
    filters (like the blob-limit filter). I think this could even be
    hoisted out to the filter_bitmap() function, and then we only have
    to do it in one place.

  - alternatively, the actual bitmap traversal could be smart enough to
    say "this is an explicitly-requested tip object; make sure we
    _don't_ mark it as uninteresting". Which is exactly what the
    non-bitmap traversal code does (I think this is the USER_GIVEN
    stuff). And we even try to build on that code by passing the
    list_objects_filter struct along, but:

      - that is only for fill-in traversal; an on-disk bitmap means we
	might touch those bits without calling into the list-objects
	code at all.

      - we don't feed the list-objects code all the information it needs
	anyway. We do two separate traversals, one for the "haves"
	bitmap and one for the "wants", and then AND-NOT the former into
	the latter. So when we do that "haves" traversal, those objects
	are _not_ being marked as UNINTERESTING. They are considered
	positive tips from the perspective of the traversal code.

     So I think we really do need to manipulate the bits in the bitmap
     to get the right answer. But it seems like there's a big
     optimization opportunity we're missing here. If the other side says
     "I don't have this commit, but I need this blob" (as our examples
     do), then there is no point in figuring out the "haves" bitmap at
     all! We know that it will not be used. In practice I'm not sure how
     big a deal this is (if the other side asked for any commit, we'd
     need it, and in theory it's quick to compute because of the on-disk
     bitmaps). At any rate, I think we should focus on correctness first
     here.

  - as to the mystery of why nobody has noticed this breakage: it's
    because the usual code to fetch-on-demand an object that we are
    missing _doesn't_ send any other "we already have this commit" tips.
    It just says "hey, give me these objects". So a workaround in the
    meantime is:

      git -c fetch.negotiationAlgorithm=noop fetch origin ...

I'm going to sleep on it, but I think the fix above is the right
direction. I just need to look at the other filters, and add a bit of
polish.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux