[PATCH v2 0/2] When fetching, die if in commit graph but not obj db

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks everyone for your review comments. I've updated the patch 1
commit message as Josh requested. I'll reply individually to comments on
patch 2.

Jonathan Tan (2):
  Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"
  fetch-pack: warn if in commit graph but not obj db

 fetch-pack.c                               | 42 +++++++++++-----------
 t/t5330-no-lazy-fetch-with-commit-graph.sh |  2 +-
 2 files changed, 23 insertions(+), 21 deletions(-)

Range-diff against v1:
1:  4dea8933cf ! 1:  34e87b8388 Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"
    @@ Commit message
     
         This reverts commit a6e65fb39caf18259c660c1c7910d5bf80bc15cb.
     
    +    This revert simplifies the next patch in this patch set.
    +
         The commit message of that commit mentions that the new function "will
         be used for the bundle-uri client in a subsequent commit", but it seems
         that eventually it wasn't used.
2:  1027ff2cb7 ! 2:  631b9a8677 fetch-pack: warn if in commit graph but not obj db
    @@ Commit message
         action, because the object is present in the commit graph file but not in
         the object DB.
     
    -    Therefore, detect when this occurs and print a warning. (Note that
    -    we cannot proceed to include this object in the list of objects to
    -    be fetched without changing at least the fetch negotiation code:
    -    what would happen is that the client will send "want X" and "have X"
    -    and when I tested at $DAYJOB with a work server that uses JGit, the
    -    server reasonably returned an empty packfile. And changing the fetch
    -    negotiation code to only use the object DB when deciding what to report
    -    as "have" would be an unnecessary slowdown, I think.)
    +    Therefore, make it a fatal error when this occurs. (Note that we cannot
    +    proceed to include this object in the list of objects to be fetched
    +    without changing at least the fetch negotiation code: what would happen
    +    is that the client will send "want X" and "have X" and when I tested
    +    at $DAYJOB with a work server that uses JGit, the server reasonably
    +    returned an empty packfile. And changing the fetch negotiation code to
    +    only use the object DB when deciding what to report as "have" would be
    +    an unnecessary slowdown, I think.)
     
         This was discovered when a lazy fetch of a missing commit completed with
         nothing actually fetched, and the writing of the commit graph file after
         every fetch then attempted to read said missing commit, triggering a
         lazy fetch of said missing commit, resulting in an infinite loop with no
         user-visible indication (until they check the list of processes running
    -    on their computer). With this fix, at least a warning message will be
    -    printed. Note that although the repo corruption we discovered was caused
    -    by a bug in GC in a partial clone, the behavior that this patch teaches
    -    Git to warn about applies to any repo with commit graph enabled and with
    -    a missing commit, whether it is a partial clone or not.
    +    on their computer). With this fix, there is no infinite loop. Note that
    +    although the repo corruption we discovered was caused by a bug in GC in
    +    a partial clone, the behavior that this patch teaches Git to warn about
    +    applies to any repo with commit graph enabled and with a missing commit,
    +    whether it is a partial clone or not.
    +
    +    t5330, introduced in 3a1ea94a49 (commit-graph.c: no lazy fetch in
    +    lookup_commit_in_graph(), 2022-07-01), tests that an interaction between
    +    fetch and the commit graph does not cause an infinite loop. This patch
    +    changes the exit code in that situation, so that test had to be changed.
     
         Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
     
      ## fetch-pack.c ##
    -@@ fetch-pack.c: static struct string_list uri_protocols = STRING_LIST_INIT_DUP;
    - #define ALTERNATE	(1U << 1)
    - #define COMMON		(1U << 6)
    - #define REACH_SCRATCH	(1U << 7)
    -+#define COMPLETE_FROM_COMMIT_GRAPH	(1U << 8)
    - 
    - /*
    -  * After sending this many "have"s if we do not get any new ACK , we
     @@ fetch-pack.c: static void for_each_cached_alternate(struct fetch_negotiator *negotiator,
    + 		cb(negotiator, cache.items[i]);
      }
      
    ++static void die_in_commit_graph_only(const struct object_id *oid)
    ++{
    ++	die(_("You are attempting to fetch %s, which is in the commit graph file but not in the object database.\n"
    ++	      "This is probably due to repo corruption.\n"
    ++	      "If you are attempting to repair this repo corruption by refetching the missing object, use 'git fetch --refetch' with the missing object."),
    ++	      oid_to_hex(oid));
    ++}
    ++
      static struct commit *deref_without_lazy_fetch(const struct object_id *oid,
     -					       int mark_tags_complete)
    -+					       int mark_additional_complete_information)
    ++					       int mark_tags_complete_and_check_obj_db)
      {
      	enum object_type type;
      	struct object_info info = { .typep = &type };
    @@ fetch-pack.c: static void for_each_cached_alternate(struct fetch_negotiator *neg
      	commit = lookup_commit_in_graph(the_repository, oid);
     -	if (commit)
     +	if (commit) {
    -+		if (mark_additional_complete_information)
    -+			commit->object.flags |= COMPLETE_FROM_COMMIT_GRAPH;
    ++		if (mark_tags_complete_and_check_obj_db) {
    ++			if (!has_object(the_repository, oid, 0))
    ++				die_in_commit_graph_only(oid);
    ++		}
      		return commit;
     +	}
      
    @@ fetch-pack.c: static struct commit *deref_without_lazy_fetch(const struct object
      			if (!tag->tagged)
      				return NULL;
     -			if (mark_tags_complete)
    -+			if (mark_additional_complete_information)
    ++			if (mark_tags_complete_and_check_obj_db)
      				tag->object.flags |= COMPLETE;
      			oid = &tag->tagged->oid;
      		} else {
    -@@ fetch-pack.c: static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
    - 	save_commit_buffer = old_save_commit_buffer;
    - }
    - 
    -+static void warn_in_commit_graph_only(const struct object_id *oid)
    -+{
    -+	warning(_("You are attempting to fetch %s, which is in the commit graph file but not in the object database."),
    -+		oid_to_hex(oid));
    -+	warning(_("This is probably due to repo corruption."));
    -+	warning(_("If you are attempting to repair this repo corruption by refetching the missing object, use 'git fetch --refetch' with the missing object."));
    -+}
    -+
    - /*
    -  * Returns 1 if every object pointed to by the given remote refs is available
    -  * locally and reachable from a local ref, and 0 otherwise.
    -@@ fetch-pack.c: static int everything_local(struct fetch_pack_args *args,
    - 				      ref->name);
    - 			continue;
    - 		}
    -+		if (o->flags & COMPLETE_FROM_COMMIT_GRAPH) {
    -+			if (!has_object(the_repository, remote, 0))
    -+				warn_in_commit_graph_only(remote);
    -+		}
    - 		print_verbose(args, _("already have %s (%s)"), oid_to_hex(remote),
    - 			      ref->name);
    - 	}
     
    - ## object.h ##
    -@@ object.h: void object_array_init(struct object_array *array);
    - /*
    -  * object flag allocation:
    -  * revision.h:               0---------10         15               23------27
    -- * fetch-pack.c:             01    67
    -+ * fetch-pack.c:             01    6-8
    -  * negotiator/default.c:       2--5
    -  * walker.c:                 0-2
    -  * upload-pack.c:                4       11-----14  16-----19
    + ## t/t5330-no-lazy-fetch-with-commit-graph.sh ##
    +@@ t/t5330-no-lazy-fetch-with-commit-graph.sh: test_expect_success 'fetch any commit from promisor with the usage of the commit
    + 	test_commit -C with-commit any-commit &&
    + 	anycommit=$(git -C with-commit rev-parse HEAD) &&
    + 	GIT_TRACE="$(pwd)/trace.txt" \
    +-		git -C with-commit-graph fetch origin $anycommit 2>err &&
    ++		test_must_fail git -C with-commit-graph fetch origin $anycommit 2>err &&
    + 	! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
    + 	grep "git fetch origin" trace.txt >actual &&
    + 	test_line_count = 1 actual
-- 
2.47.0.163.g1226f6d8fa-goog





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux