On 07.12.20 19:42, Philippe Blain wrote:
Hi Peter,
Le 7 déc. 2020 à 08:46, Peter Kaestle <peter.kaestle@xxxxxxxxx> a écrit :
A regression has been introduced by a62387b (submodule.c: fetch in
submodules git directory instead of in worktree, 2018-11-28).
The scenario in which it triggers is when one has a remote repository
with a subrepository inside a subrepository like this:
superproject/middle_repo/inner_repo
The correct terminology is "submodule", not "subrepository".
Also, (minor point) I would just write "when one has a repository",
as its simpler (the repository by itself is not "remote", it is only "remote"
in relation the repositories that are cloned from it).
ok.
Person A and B have both a clone of it, while Person B is not working
with the inner_repo and thus does not have it initialized in his working
copy.
Now person A introduces a change to the inner_repo and propagates it
through the middle_repo and the superproject.
Once person A pushed the changes and person B wants to fetch them using
"git fetch" on superproject level,
s/on/at the/
ok.
B's git call will return with error
saying:
Could not access submodule 'inner_repo'
Errors during submodule fetch:
middle_repo
Expectation is that in this case the inner submodule will be recognized
as uninitialized subrepository and skipped by the git fetch command.
here again, terminology: "as an uninitialized submodule"
ok.
This used to work correctly before 'a62387b (submodule.c: fetch in
submodules git directory instead of in worktree, 2018-11-28)'.
Starting with a62387b the code wants to evaluate "is_empty_dir()" inside
.git/modules for a directory only existing in the worktree, delivering
then of course wrong return value.
This patch ensures is_empty_dir() is getting the correct path of the
uninitialized submodule by concatenation of the actual worktree and the
name of the uninitialized submodule.
Furthermore a regression test case is added, which tests for recursive
fetches on a superproject with uninitialized sub repositories.
This
issue was leading to an infinite loop when doing a revert of a62387b.
I would maybe add more details here, something like the following
(we can cite your previous attempt, because it was merged to 'master'):
The first attempt to fix this regression, in 1b7ac4e6d4 (submodules:
fix of regression on fetching of non-init subsub-repo, 2020-11-12), by simply
reverting a62387b, resulted in
an infinite loop of submodule fetches in the simpler case of a recursive fetch of a superproject with
uninitialized submodules, and so this commit was reverted in 7091499bc0 (Revert
"submodules: fix of regression on fetching of non-init subsub-repo", 2020-12-02).
To prevent future breakages, also add a regression test for this scenario.
Jip, I like that.
Signed-off-by: Peter Kaestle <peter.kaestle@xxxxxxxxx>
CC: Junio C Hamano <gitster@xxxxxxxxx>
CC: Philippe Blain <levraiphilippeblain@xxxxxxxxx>
CC: Ralf Thielow <ralf.thielow@xxxxxxxxx>
CC: Eric Sunshine <sunshine@xxxxxxxxxxxxxx>
---
submodule.c | 7 ++-
t/t5526-fetch-submodules.sh | 104 ++++++++++++++++++++++++++++++++++++
2 files changed, 110 insertions(+), 1 deletion(-)
diff --git a/submodule.c b/submodule.c
index b3bb59f066..b561445329 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1477,6 +1477,7 @@ static int get_next_submodule(struct child_process *cp,
strbuf_release(&submodule_prefix);
return 1;
} else {
+ struct strbuf empty_submodule_path = STRBUF_INIT;
fetch_task_release(task);
free(task);
@@ -1485,13 +1486,17 @@ static int get_next_submodule(struct child_process *cp,
* An empty directory is normal,
* the submodule is not initialized
*/
+ strbuf_addf(&empty_submodule_path, "%s/%s/",
+ spf->r->worktree,
+ ce->name);
if (S_ISGITLINK(ce->ce_mode) &&
- !is_empty_dir(ce->name)) {
+ !is_empty_dir(empty_submodule_path.buf)) {
spf->result = 1;
strbuf_addf(err,
_("Could not access submodule '%s'\n"),
ce->name);
}
+ strbuf_release(&empty_submodule_path);
}
}
Maybe a personal preference, but I would have gone for something a little simpler, like the following:
diff --git a/submodule.c b/submodule.c
index b3bb59f066..4200865174 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1486,7 +1486,7 @@ static int get_next_submodule(struct child_process *cp,
* the submodule is not initialized
*/
if (S_ISGITLINK(ce->ce_mode) &&
- !is_empty_dir(ce->name)) {
+ !is_empty_dir(repo_worktree_path(spf->r, "%s", ce->name))) {
I'm not deep enough into the git code to judge which approach is the
better one. From my perspective, being a foreigner to the git code, I
like my proposed code more, as for me it's much easier to understand
what's happening by having a meaningful variable name and without being
forced to dig into outer functions first.
Also Junio C Hamano <gitster@xxxxxxxxx> is having some concerns, which I
can't judge:
But then you leak the return value from repo_worktree_path(), no?
Thus for v3 I'll stick to my proposal and when you'll review it, please
discuss with each other whether I should go for a v4 using
repo_worktree_path().
[...]
+
+test_expect_success 'setup recursive fetch with uninit submodule' '
+ # does not depend on any previous test setups
+
+ git init main &&
+ git init sub &&
+
+ >sub/file &&
+ git -C sub add file &&
+ git -C sub commit -m "add file" &&
+ git -C sub rev-parse HEAD >expect &&
+
+ git -C main submodule add ../sub &&
+ git -C main submodule init &&
+ git -C main submodule update --checkout &&
These two steps are unnecessary as they are implicitly done by 'git submodule add'.
I think we could reflect real life a little bit more by cloning the superproject, and running
the 'recursive fetch with uninit submodule' test below in the clone.
Yes, you're right, "...init" and "...update..." can be removed.
+ git -C main submodule status >out &&
+ sed -e "s/^ //" -e "s/ sub .*$//" out >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'recursive fetch with uninit submodule' '
+ # depends on previous test for setup
+
+ git -C main submodule deinit -f sub &&
Here you are deiniting the submodule, such that
the Git directory will stay in .git/modules/sub. This is not the same thing
as a submodule that was never initialized ("uninitialized"), for which .git/modules/sub
will not yet exist. So maybe we could harden the tests by also testing
for that scenario ? I don't know... maybe the infinite loop only happens
if .git/modules/sub actually already exists. If so, the test name should be
"recursive fetch with deinitialized submodule", I think.
I added another test case for v3, which checks for this in case of never
initialized submodule. When executing the test, I can see that the
infinite loop regression only occurs after doing the init followed by a
deinit. Thus renaming the test accordingly.
--
best regards
--peter;