[PATCH v2] revision: add `--ignore-missing-links` user option

The revision backend is used by multiple porcelain commands such as
git-rev-list(1) and git-log(1). The backend currently supports ignoring
missing links by setting the `ignore_missing_links` bit. This allows the
revision walk to skip any object links which are missing. Expose this
bit via an `--ignore-missing-links` user option.

A scenario where this option would be used is to find the boundary
objects between different object directories. Consider a repository with
a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
repository, enabling this option along with the `--boundary` option
while disabling the alternate object directory allows us to find the
boundary objects between the main and alternate object directories.

Signed-off-by: Karthik Nayak <karthik.188@xxxxxxxxx>
---

Changes from v1:
1. Reword the commit message and option description to be more specific
about what changes and why.
2. Ensure the new option also works with the existing `--objects` option.
3. More specific testing for the boundary commit.

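For reviewers who want to try the scenario from the commit message by hand,
here is a minimal sketch. It assumes a git build that already contains this
patch; the directory names (`main`, `alt`), the commit messages and the helper
`list_boundary_commits` are made up for illustration and are not part of the
series.

```shell
#!/bin/sh
# Sketch only: requires a git that understands --ignore-missing-links.

# Fixed identity so the sketch does not depend on the user's git config.
GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# Hypothetical helper: print the "-"-prefixed lines that --boundary
# reports for the repository in "$1", with the prefix removed.
list_boundary_commits () {
	git -C "$1" rev-list --ignore-missing-links --boundary HEAD |
	sed -n 's/^-//p'
}

demo=$(mktemp -d) && cd "$demo" &&
git init -q main &&
git -C main commit -q --allow-empty -m one &&
git -C main commit -q --allow-empty -m two &&
expected=$(git -C main rev-parse HEAD) &&

# Move every existing object into the alternate directory ...
mkdir alt &&
mv main/.git/objects/* alt &&

# ... and create newer commits, which land in the main object directory.
GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt \
	git -C main commit -q --allow-empty -m three &&
GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt \
	git -C main commit -q --allow-empty -m four &&

# With the alternate disabled, the parent of "three" is a missing link.
boundary=$(list_boundary_commits main) &&
printf 'boundary: %s\n' "$boundary"
```

With the option available, the printed ID should be that of commit "two", the
newest commit that lives only in the alternate object directory.
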
Range diff against v1:

1:  c0a4dca9b0 ! 1:  e3f4d85732 revision: add `--ignore-missing-links` user option
    @@ Commit message
         The revision backend is used by multiple porcelain commands such as
         git-rev-list(1) and git-log(1). The backend currently supports ignoring
         missing links by setting the `ignore_missing_links` bit. This allows the
    -    revision walk to skip any objects links which are missing.
    +    revision walk to skip any object links which are missing. Expose this
    +    bit via an `--ignore-missing-links` user option.
     
    -    Currently there is no way to use git-rev-list(1) to traverse the objects
    -    of the main object directory (GIT_OBJECT_DIRECTORY) and print the
    -    boundary objects when moving from the main object directory to the
    -    alternate object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES).
    -
    -    By exposing this new flag `--ignore-missing-links`, users can set the
    -    required env variables (GIT_OBJECT_DIRECTORY and
    -    GIT_ALTERNATE_OBJECT_DIRECTORIES) along with the `--boundary` flag to
    -    find the boundary objects between object directories.
    +    A scenario where this option would be used is to find the boundary
    +    objects between different object directories. Consider a repository with
    +    a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
    +    object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
    +    repository, enabling this option along with the `--boundary` option
    +    while disabling the alternate object directory allows us to find the
    +    boundary objects between the main and alternate object directories.
     
         Signed-off-by: Karthik Nayak <karthik.188@xxxxxxxxx>
     
    @@ Documentation/rev-list-options.txt: explicitly.
      	the bad input was not given.
      
     +--ignore-missing-links::
    -+	When an object points to another object that is missing, pretend as if the
    -+	link did not exist. These missing links are not written to stdout unless
    -+	the --boundary flag is passed.
    ++	During traversal, if an object that is referenced does not
    ++	exist, instead of dying with a repository corruption error, pretend as
    ++	if the reference itself does not exist. Running the command
    ++	with the `--boundary` option makes these missing commits,
    ++	together with the commits on the edge of revision ranges
    ++	(i.e. true boundary objects), appear on the output, prefixed
    ++	with '-'.
     +
      ifndef::git-rev-list[]
      --bisect::
      	Pretend as if the bad bisection ref `refs/bisect/bad`
     
    + ## builtin/rev-list.c ##
    +@@ builtin/rev-list.c: static int finish_object(struct object *obj, const char *name UNUSED,
    + {
    + 	struct rev_list_info *info = cb_data;
    + 	if (oid_object_info_extended(the_repository, &obj->oid, NULL, 0) < 0) {
    +-		finish_object__ma(obj);
    ++		if (!info->revs->ignore_missing_links)
    ++			finish_object__ma(obj);
    + 		return 1;
    + 	}
    + 	if (info->revs->verify_objects && !obj->parsed && obj->type != OBJ_COMMIT)
    +
      ## revision.c ##
     @@ revision.c: static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
      		revs->limited = 1;
    @@ t/t6022-rev-list-alternates.sh (new)
     +test_expect_success 'create repository and alternate directory' '
     +	git init main &&
     +	test_commit_bulk -C main 5 &&
    ++	BOUNDARY_COMMIT=$(git -C main rev-parse HEAD) &&
     +	mkdir alt &&
     +	mv main/.git/objects/* alt &&
     +	GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt test_commit_bulk --start=6 -C main 5
     +'
     +
    -+# When the alternate odb is provided, all commits are listed.
    ++# When the alternate odb is provided, all commits are listed, including the
    ++# boundary commit.
     +test_expect_success 'rev-list passes with alternate object directory' '
    -+	GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt test_stdout_line_count = 10 git -C main rev-list HEAD
    ++	GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt git -C main rev-list HEAD >actual &&
    ++	test_stdout_line_count = 10 cat actual &&
    ++	grep $BOUNDARY_COMMIT actual
     +'
     +
     +# When the alternate odb is not provided, rev-list fails since the 5th commit's
    @@ t/t6022-rev-list-alternates.sh (new)
     +'
     +
     +# With `--ignore-missing-links`, we stop the traversal when we encounter a
    -+# missing link.
    ++# missing link. The boundary commit is not listed as we haven't used the
    ++# `--boundary` option.
     +test_expect_success 'rev-list only prints main odb commits with --ignore-missing-links' '
    -+	test_stdout_line_count = 5 git -C main rev-list --ignore-missing-links HEAD
    ++	git -C main rev-list --ignore-missing-links HEAD >actual &&
    ++	test_stdout_line_count = 5 cat actual &&
    ++	! grep $BOUNDARY_COMMIT actual
     +'
     +
     +# With `--ignore-missing-links` and `--boundary`, we can even print those boundary
     +# commits.
     +test_expect_success 'rev-list prints boundary commit with --ignore-missing-links' '
    -+	git -C main rev-list --ignore-missing-links --boundary HEAD >list-output &&
    -+	test_stdout_line_count = 6 cat list-output &&
    -+	test_stdout_line_count = 1 cat list-output | grep "^-"
    ++	git -C main rev-list --ignore-missing-links --boundary HEAD >actual &&
    ++	test_stdout_line_count = 6 cat actual &&
    ++	grep -e "-$BOUNDARY_COMMIT" actual
    ++'
    ++
    ++# The `--ignore-missing-links` option should ensure that git-rev-list(1) doesn't
    ++# fail when used alongside `--objects` when a tree is missing.
    ++test_expect_success 'rev-list --ignore-missing-links works with missing tree' '
    ++	echo "foo" >main/file &&
    ++	git -C main add file &&
    ++	GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt git -C main commit -m"commit 11" &&
    ++	TREE_OID=$(git -C main rev-parse HEAD^{tree}) &&
    ++	mkdir -p alt/$(dirname $(test_oid_to_path $TREE_OID)) &&
    ++	mv main/.git/objects/$(test_oid_to_path $TREE_OID) alt/$(test_oid_to_path $TREE_OID) &&
    ++	git -C main rev-list --ignore-missing-links --objects HEAD >actual &&
    ++	! grep $TREE_OID actual
    ++'
    ++
    ++# Similar to above, it should also work when a blob is missing.
    ++test_expect_success 'rev-list --ignore-missing-links works with missing blob' '
    ++	echo "bar" >main/file &&
    ++	git -C main add file &&
    ++	GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt git -C main commit -m"commit 12" &&
    ++	BLOB_OID=$(git -C main rev-parse HEAD:file) &&
    ++	mkdir -p alt/$(dirname $(test_oid_to_path $BLOB_OID)) &&
    ++	mv main/.git/objects/$(test_oid_to_path $BLOB_OID) alt/$(test_oid_to_path $BLOB_OID) &&
    ++	git -C main rev-list --ignore-missing-links --objects HEAD >actual &&
    ++	! grep $BLOB_OID actual
     +'
     +
     +test_done

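A note on the boundary checks in the tests: a boundary line starts with '-',
so a pattern that matches it from the start of the line also begins with '-',
and grep would parse such an argument as options. The sketch below uses
made-up object IDs (not from any real repository) to show two portable ways
of passing a pattern like that.

```shell
#!/bin/sh
# Made-up object IDs standing in for `git rev-list --boundary` output.
shown=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
boundary=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
actual=$(mktemp)
printf '%s\n' "$shown" "-$boundary" >"$actual"

# -e marks the next argument as the pattern, even if it starts with "-".
with_e=$(grep -e "-$boundary" "$actual")

# Anchoring also works, since the argument then starts with "^", not "-".
anchored=$(grep "^-$boundary" "$actual")

printf '%s\n' "$with_e" "$anchored"
rm -f "$actual"
```

Both commands select only the '-'-prefixed line; the anchored form also
insists that the prefix sits at the start of the line.
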

 Documentation/rev-list-options.txt |  9 ++++
 builtin/rev-list.c                 |  3 +-
 revision.c                         |  2 +
 t/t6022-rev-list-alternates.sh     | 75 ++++++++++++++++++++++++++++++
 4 files changed, 88 insertions(+), 1 deletion(-)
 create mode 100755 t/t6022-rev-list-alternates.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index a4a0cb93b2..8ee713db3d 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -227,6 +227,15 @@ explicitly.
 	Upon seeing an invalid object name in the input, pretend as if
 	the bad input was not given.
 
+--ignore-missing-links::
+	During traversal, if an object that is referenced does not
+	exist, instead of dying with a repository corruption error, pretend as
+	if the reference itself does not exist. Running the command
+	with the `--boundary` option makes these missing commits,
+	together with the commits on the edge of revision ranges
+	(i.e. true boundary objects), appear on the output, prefixed
+	with '-'.
+
 ifndef::git-rev-list[]
 --bisect::
 	Pretend as if the bad bisection ref `refs/bisect/bad`
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index ff715d6918..5239d83c76 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -266,7 +266,8 @@ static int finish_object(struct object *obj, const char *name UNUSED,
 {
 	struct rev_list_info *info = cb_data;
 	if (oid_object_info_extended(the_repository, &obj->oid, NULL, 0) < 0) {
-		finish_object__ma(obj);
+		if (!info->revs->ignore_missing_links)
+			finish_object__ma(obj);
 		return 1;
 	}
 	if (info->revs->verify_objects && !obj->parsed && obj->type != OBJ_COMMIT)
diff --git a/revision.c b/revision.c
index 2f4c53ea20..cbfcbf6e28 100644
--- a/revision.c
+++ b/revision.c
@@ -2595,6 +2595,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->limited = 1;
 	} else if (!strcmp(arg, "--ignore-missing")) {
 		revs->ignore_missing = 1;
+	} else if (!strcmp(arg, "--ignore-missing-links")) {
+		revs->ignore_missing_links = 1;
 	} else if (opt && opt->allow_exclude_promisor_objects &&
 		   !strcmp(arg, "--exclude-promisor-objects")) {
 		if (fetch_if_missing)
diff --git a/t/t6022-rev-list-alternates.sh b/t/t6022-rev-list-alternates.sh
new file mode 100755
index 0000000000..08d9ffde5f
--- /dev/null
+++ b/t/t6022-rev-list-alternates.sh
@@ -0,0 +1,75 @@
+#!/bin/sh
+
+test_description='handling of alternates in rev-list'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+# We create 5 commits and move them to the alt directory and
+# create 5 more commits which will stay in the main odb.
+test_expect_success 'create repository and alternate directory' '
+	git init main &&
+	test_commit_bulk -C main 5 &&
+	BOUNDARY_COMMIT=$(git -C main rev-parse HEAD) &&
+	mkdir alt &&
+	mv main/.git/objects/* alt &&
+	GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt test_commit_bulk --start=6 -C main 5
+'
+
+# When the alternate odb is provided, all commits are listed, including the
+# boundary commit.
+test_expect_success 'rev-list passes with alternate object directory' '
+	GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt git -C main rev-list HEAD >actual &&
+	test_stdout_line_count = 10 cat actual &&
+	grep $BOUNDARY_COMMIT actual
+'
+
+# When the alternate odb is not provided, rev-list fails since the 5th commit's
+# parent is not present in the main odb.
+test_expect_success 'rev-list fails without alternate object directory' '
+	test_must_fail git -C main rev-list HEAD
+'
+
+# With `--ignore-missing-links`, we stop the traversal when we encounter a
+# missing link. The boundary commit is not listed as we haven't used the
+# `--boundary` option.
+test_expect_success 'rev-list only prints main odb commits with --ignore-missing-links' '
+	git -C main rev-list --ignore-missing-links HEAD >actual &&
+	test_stdout_line_count = 5 cat actual &&
+	! grep $BOUNDARY_COMMIT actual
+'
+
+# With `--ignore-missing-links` and `--boundary`, we can even print those boundary
+# commits.
+test_expect_success 'rev-list prints boundary commit with --ignore-missing-links' '
+	git -C main rev-list --ignore-missing-links --boundary HEAD >actual &&
+	test_stdout_line_count = 6 cat actual &&
+	grep -e "-$BOUNDARY_COMMIT" actual
+'
+
+# The `--ignore-missing-links` option should ensure that git-rev-list(1) doesn't
+# fail when used alongside `--objects` when a tree is missing.
+test_expect_success 'rev-list --ignore-missing-links works with missing tree' '
+	echo "foo" >main/file &&
+	git -C main add file &&
+	GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt git -C main commit -m"commit 11" &&
+	TREE_OID=$(git -C main rev-parse HEAD^{tree}) &&
+	mkdir -p alt/$(dirname $(test_oid_to_path $TREE_OID)) &&
+	mv main/.git/objects/$(test_oid_to_path $TREE_OID) alt/$(test_oid_to_path $TREE_OID) &&
+	git -C main rev-list --ignore-missing-links --objects HEAD >actual &&
+	! grep $TREE_OID actual
+'
+
+# Similar to above, it should also work when a blob is missing.
+test_expect_success 'rev-list --ignore-missing-links works with missing blob' '
+	echo "bar" >main/file &&
+	git -C main add file &&
+	GIT_ALTERNATE_OBJECT_DIRECTORIES=$PWD/alt git -C main commit -m"commit 12" &&
+	BLOB_OID=$(git -C main rev-parse HEAD:file) &&
+	mkdir -p alt/$(dirname $(test_oid_to_path $BLOB_OID)) &&
+	mv main/.git/objects/$(test_oid_to_path $BLOB_OID) alt/$(test_oid_to_path $BLOB_OID) &&
+	git -C main rev-list --ignore-missing-links --objects HEAD >actual &&
+	! grep $BLOB_OID actual
+'
+
+test_done
-- 
2.41.0

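The last two tests move a single loose object into the alternate, which
relies on the loose-object layout: the first two hex digits of the object ID
name a directory and the remaining digits name the file. A minimal POSIX sh
sketch of that split follows; the helper name `oid_to_path` is hypothetical
and only serves the illustration.

```shell
#!/bin/sh
# Hypothetical helper: map an object ID to the relative path of its
# loose object, i.e. "<first two digits>/<remaining digits>".
oid_to_path () {
	rest=${1#??}	# everything after the first two characters
	printf '%s/%s\n' "${1%"$rest"}" "$rest"
}

oid=4b825dc642cb6eb9a060e54bf8d69288fbee4904	# SHA-1 of the empty tree
path=$(oid_to_path "$oid")
dir=${path%/*}	# directory part only

printf '%s\n' "$path" "$dir"
```

Only standard parameter expansion is used here, so the split does not depend
on substring expansion, which plain /bin/sh implementations such as dash do
not support.
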