Re: git submodules implementation question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <gitster@xxxxxxxxx> writes:

> Stefan Beller <sbeller@xxxxxxxxxx> writes:
>
>>> The final version needs to be accompanied with tests to show the
>>> effect of this change for callers.  A test would set up a top-level
>>> and submodule, deliberately break submodule/.git/ repository and
>>> show what breaks and how without this change.
>>
>> Tests are really good at providing this context as well, or to communicate
>> the actual underlying problem, which is not quite clear to me.
>> That is why I refrained from jumping into the discussion as I think the
>> first few emails were dropped from the mailing list and I am missing context.
>
> I do not know where you started reading, but the gist of it is that
> submodule.c spawns subprocess to run in the submodule's context by
> assuming that chdir'ing into the <path> of the submodule and running
> it (i.e. cp.dir set to <path> to drive start_command(&cp)) is
> sufficient.  When <path>/.git (either it is a directory itself or it
> points at a directory in .git/module/<name> in the superproject) is
> a corrupt repository, running "git -C <path> command" would try to
> auto-detect the repository, because it thinks <path>/.git is not a
> repository and it thinks it is not at the top-level of the working
> tree, and instead finds the repository of the top-level, which is
> almost never what we want.

This is with a test that covers the call in get_next_submodule() for
the parallel fetch callback.  I think many of the codepaths will end
up recursing forever the same way without the fix in a submodule
repository that is broken in a similar way, but I didn't check, so
I do not consider this to be completed.

-- >8 --
Subject: submodule: avoid auto-discovery in prepare_submodule_repo_env()

The function is used to set up the environment variable used in a
subprocess we spawn in a submodule directory.  The callers set up a
child_process structure, find the working tree path of one submodule 
and set .dir field to it, and then use start_command() API to spawn
the subprocess like "status", "fetch", etc.

When this happens, we expect that the ".git" (either a directory or
a gitfile that points at the real location) in the current working
directory of the subprocess MUST be the repository for the submodule.

If this ".git" thing is a corrupt repository, however, because
prepare_submodule_repo_env() unsets GIT_DIR and GIT_WORK_TREE, the
subprocess will see ".git", thinks it is not a repository, and
attempt to find one by going up, likely to end up in finding the
repository of the superproject.  In some codepaths, this will cause
a command run with the "--recurse-submodules" option to recurse
forever.

By exporting GIT_DIR=.git, disable the auto-discovery logic in the
subprocess, which would instead stop it and report an error.

Not-signed-off-yet.
---
 submodule.c                 |  1 +
 t/t5526-fetch-submodules.sh | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/submodule.c b/submodule.c
index 1b5cdfb..e8258f0 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1160,4 +1160,5 @@ void prepare_submodule_repo_env(struct argv_array *out)
 		if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
 			argv_array_push(out, *var);
 	}
+	argv_array_push(out, "GIT_DIR=.git");
 }
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 954d0e4..b2dee30 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -485,4 +485,33 @@ test_expect_success 'fetching submodules respects parallel settings' '
 	)
 '
 
+test_expect_success 'fetching submodule into a broken repository' '
+	# Prepare src and src/sub nested in it
+	git init src &&
+	(
+		cd src &&
+		git init sub &&
+		git -C sub commit --allow-empty -m "initial in sub" &&
+		git submodule add -- ./sub sub &&
+		git commit -m "initial in top"
+	) &&
+
+	# Clone the old-fashoned way
+	git clone src dst &&
+	git -C dst clone ../src/sub sub &&
+
+	# Make sure that old-fashoned layout is still supported
+	git -C dst status &&
+
+	# Recursive-fetch works fine
+	git -C dst fetch --recurse-submodules &&
+
+	# Break the receiving submodule
+	rm -f dst/sub/.git/HEAD &&
+
+	# Recursive-fetch must terminate
+	# NOTE: without fix this will recurse forever!
+	test_must_fail git -C dst fetch --recurse-submodules
+'
+
 test_done





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]