Re: [RFC PATCH 2/2] submodule update: continue when a checkout fails

Junio C Hamano <gitster@xxxxxxxxx> · Sun, 15 May 2011 14:26:37 -0700

Fredrik Gustafsson <iveqy@xxxxxxxxx> writes:

> When running git submodule update the submodules are checked out in
> alphabetic order. When an update of a submodule fails because of a
> checkout error, continue to the next submodule and when done with all
> submodules, exit with an error.
>
> We only do this for 'safe' case when a submodule is not marked as
> rebase or merge. When the update of a submodule fails because of a merge
> or rebase, the update will still die immediately to give the user an
> opportunity to resolve any conflicts before continuing.
>
> Since submodule 'b' does not necessarily need to be dependent on
> submodule 'a' this behavior is helpful if we have a lot of submodules.
> For example if some submodules currently experience network problems
> we can securely continue with the other submodules and the user can
> revisit the failed one later on.
>
> It also is helpful if a checkout fails because a submodules working
> directory is dirty. Now the user can cleanup the submodule in question
> and another git submodule update will just update the failed submodule
> instead of all submodules that are ordered alphabetically afterwards.
>
> Signed-off-by: Fredrik Gustafsson <iveqy@xxxxxxxxx>
> Mentored-by: Jens Lehmann <Jens.Lehmann@xxxxxx>
> Mentored-by: Heiko Voigt <hvoigt@xxxxxxxxxx>

Thanks, even though I find that the logic does not flow as smoothly as it
could in the above description, it now has sufficient information. Compared
to the previous round, it is a vast improvement.

I would rephrase it differently though. The problem description at the
highest level should come first.

	"git submodule update" stops at the first error and gives control
	back to the user. Only after the user fixes the problematic
	submodule and runs "git submodule update" again, the second error
	is found. And the user needs to repeat until all the problems are
	found and fixed one by one. This is tedious.

Then give a high level description of the proposed approach.

	Instead, the command can remember which submodules it had trouble
	with, continue updating the ones it can, and report which ones had
	errors at the end. The user can run "git submodule update", find
	all the ones that need minor fixing (e.g. working tree was dirty)
	to fix them in a single pass. Then another "git submodule update"
	can be run to update all.

And follow up with minor details.

	Note that the problematic submodules are skipped only when they
	are to be integrated with a safer value of submodule.<name>.update
	option, namely "checkout". Fixing a failure in a submodule that
	uses "rebase" or "merge" may need an involved conflict resolution
	by the user, and leaving too many submodules in states that need
	resolution would not reduce the mental burden on the user.

Wouldn't that be easier to understand? After all "alphabetical" is not an
essential part of the problem, but "stopping at the first error, with an
arbitrary definition of order of the submodules (which happens to be
alphabetical)" is the problem you are addressing, no?

Expressed that way, it may become clearer that it is dubious that stopping
on "rebase/merge" is something you should unilaterally enforce; instead,
the user may want to be able to configure if the command should skip and
continue in such a case.

>  git-submodule.sh            |   42 ++++++++++++++++++++++++++++++++++++------
>  t/t7406-submodule-update.sh |   29 +++++++++++++++++++++++++++++
>  2 files changed, 65 insertions(+), 6 deletions(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index bf110e9..02c41c7 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -525,17 +526,46 @@ cmd_update()
>  				;;
>  			esac
>  
> -			(clear_local_git_env; cd "$path" && $command "$sha1") ||
> -			die "Unable to $action '$sha1' in submodule path '$path'"
> -			say "Submodule path '$path': $msg '$sha1'"
> +			if (clear_local_git_env; cd "$path" && $command "$sha1")
> +			then
> +				say "Submodule path '$path': $msg '$sha1'"
> +			else
> +				case $action in
> +				rebase|merge)
> +					die "Unable to $action '$sha1' in submodule path '$path'" 2

Per my comments to your 1/2, this would become

					die_with_status 2 "Unable to $action '$sha1' in" \
				        	"submodule path '$path'"

I think.  And 1/2 would result in something like this:

	die_with_status () {
		status=$1
                shift
                echo >&2 "$*"
                exit $status
	}

        die () {
        	die_with_status 1 "$@"
	}

> +					err="Failed to $action one or more submodule(s)"
> +					continue

See below...

>  		if test -n "$recursive"
>  		then
> +			(clear_local_git_env; cd "$path" && eval cmd_update "$orig_flags")
> +			res=$?
> +			if test $res -gt 0
> +			then
> +				if test $res -eq 1
> +				then
> +					say "Failed to recurse into submodule path '$path'"
> +					continue

Hmm, should this case be err="..."?

> +				else
> +					die "Failed to recurse into submodule path '$path'" $res
> +				fi
> +			fi
>  		fi
>  	done
> +
> +	if test -n "$err"
> +	then
> +		die "$err"

I am not sure if you are properly addressing the original problem that you
are trying to solve.

When the first error stops the user, the user needs to fix the situation
with that submodule and continue to get another error, and then needs to
continue until all the errors are fixed incrementally, which is tedious.
But at least in the current code, when the user gets control back, the
error message clearly says which submodule was troublesome, so the user
knows where to go to fix things to make progress.

But after your change, the user essentially gets "Many submodules were
updated but there were a few that were skipped. Scroll up and read reams
of output to find out which ones they were".  Aren't you converting one
form of tedium with another?

How about making $err variable to accumulate the names/paths of submodules
you had trouble updating:

	punted_submodules= ;# initialization
	...

	if (... $command "$sha1")
        then
        	case "$action" in
                $unsafe_actions)
                	die_with_status 2 "Unable to $action '$sha1' ..." ;;
		*)
                	punted_submodules="$punted_submodules$path "
                        continue ;;
		esac

and formulate the error message here to list all of them, perhaps like:

	if test -n "$punted_modules"
        then
		die "Failed to update some modules: $punted_modules"
	fi

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html