Re: [PATCH] Teach git submodule update to use distributed repositories

On Thu, Jul 17, 2008 at 04:07:11PM +0100, Nigel Magnay wrote:
> And it works, but
> 
> $ git pull fred
> $ git submodule update
> 
> Can leave you with problems, because if a submodule wasn't pushed to
> origin, you won't have it available. This is because the commands are
> equivalent to
> 
> $ git pull fred
> for each submodule()
>   cd submodule
>   git fetch origin
>   git checkout <sha1>

Oh! Only after replying to most of your mail have I realized what you
have been talking about all along - _just_ this particular failure
mode:

	"Someone pushed out a repository repointing submodules to
	invalid commits, and instead of waiting for the person to fix
	this breakage, we want to do a one-off fetch of all submodules
	from a different repository."

There's nothing else you're trying to solve by this, right?


Now, I think that this is a completely wrong problem to solve. Your
gitweb is going to be broken, everyone has to jump through hoops because
of this, and all that just because of a single mistake. It shouldn't
have _happened_ in the first place.

So the proper solution for this should be to make an update hook that
will simply not _let_ you push out a tree that's broken like this.
Something like this (completely untested):

die() { echo "$@" >&2; exit 1; }
# Update hook arguments: $1 = ref name, $2 = old sha1, $3 = new sha1.
# (A newly created ref passes an all-zero $2, which this does not handle.)
git rev-list "^$2" "$3" | while read -r commit; do
	# Commits without a .gitmodules file have nothing to check
	git show "$commit:.gitmodules" >/tmp/gm$$ 2>/dev/null || continue
	git config -f /tmp/gm$$ --get-regexp 'submodule\..*\.path' |
		cut -d ' ' -f 1 |
		sed 's/^submodule\.//; s/\.path$//' |
		while read -r submodule; do
			path=$(git config -f /tmp/gm$$ "submodule.$submodule.path")
			url=$(git config -f /tmp/gm$$ "submodule.$submodule.url")
			entry=$(git ls-tree "$commit" "$path")
			[ -n "$entry" ] || die "submodule $submodule points at a non-existing path"
			# ls-tree prints "<mode> <type> <sha1><TAB><path>"
			[ "$(echo "$entry" | cut -d ' ' -f 1)" = "160000" ] || die "submodule $submodule does not point to a gitlink entry"

			subcommit="$(echo "$entry" | cut -d ' ' -f 3 | cut -f 1)"
			urlhash="$(echo "$url" | sha1sum | cut -d ' ' -f 1)"
			# We keep local copies of submodule repositories
			# for commit existence checking
			echo "Please wait, updating $url cache..."
			if [ -d /tmp/ucache/$urlhash ]; then
				# A bare clone has no fetch refspec, so name the refs explicitly
				git --git-dir=/tmp/ucache/$urlhash fetch "$url" '+refs/heads/*:refs/heads/*'
			else
				git clone --bare "$url" /tmp/ucache/$urlhash
			fi
			[ "$(git --git-dir=/tmp/ucache/$urlhash cat-file -t "$subcommit" 2>/dev/null)" = "commit" ] || die "submodule $submodule does not point at an existing commit"
		done || exit 1	# die only leaves the pipeline subshell; propagate it
	done || exit 1
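
The gitlink check hinges on the layout of one line of git ls-tree
output, "<mode> <type> <sha1><TAB><path>". A minimal sketch of that
parsing in isolation, so it can be tried without a repository; the
helper names parse_tree_entry and is_gitlink are made up for
illustration and are not part of Git:

```shell
# Hypothetical helper: split one line of `git ls-tree` output,
# "<mode> <type> <sha1><TAB><path>", into mode, type and sha1.
parse_tree_entry() {
	tab=$(printf '\t')
	meta=${1%%"$tab"*}	# everything before the first TAB
	set -- $meta		# split "<mode> <type> <sha1>" on spaces
	entry_mode=$1
	entry_type=$2
	entry_sha=$3
}

# Succeeds only if the entry is a gitlink (submodule pointer).
is_gitlink() {
	parse_tree_entry "$1"
	[ "$entry_mode" = "160000" ] && [ "$entry_type" = "commit" ]
}

# Usage with a made-up entry line:
entry=$(printf '160000 commit 0123456789abcdef0123456789abcdef01234567\tmodule')
if is_gitlink "$entry"; then
	echo "gitlink to $entry_sha"
fi
```

Cutting on spaces alone is not enough here, because the sha1 and the
path are separated by a TAB rather than a space.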

Comments? If it seems good, it might be worth including in
contrib/hooks/. Maybe even in the default update hook, controlled by
a config option.

All the trouble here stems from the fact that normally, Git will not let
you push any invalid state to the server. That guarantee does not quite
hold for submodule pointers, but we should close that gap instead of
inventing hacks to work around it.

> Unless each submodule had a [remote] specified for "fred", you'd be
> stuffed. But what you could do is either by passing the right URL, or
> looking at the superproject [remote] for "fred" - i.e: If in the
> superproject you have
> 
> [remote "fred"]
>         url = ssh://git@xxxxxxxxxx/pub/scm/git/workspace/thing/.git
> [submodule "module"]
>         url = ssh://git@repo/pub/scm/git/module.git
> 
> Then the submodule "module" on fred, if it's a working-copy, can be calculated
>        ssh://git@xxxxxxxxxx/pub/scm/git/workspace/thing/module/.git
> 
> If it isn't a WC then you'd have to have a [remote "fred"] in that
> submodule, but I'm thinking that'd be a rare case.

This is ultra-evil. I think assuming things like this is extremely dirty
and not reasonable for universal code, _unless_ we explicitly decide
that this is a new convention you want to introduce as a recommendation.
But you should've been very clear about this upfront.

_If_ you still insist on the one-off fetches for some reason, I think
it's reasonable to provide your own simple script for your users that
will autogenerate these URLs appropriately for your particular setup.
I don't think there is any real need for a more generic solution.
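
For illustration, such a site-specific script could be as small as the
sketch below. It hard-codes the working-copy layout from your example
(the submodule checked out in place under the superproject), which is
exactly the kind of assumption that belongs in a local script rather
than in git-submodule. The function name submodule_url_for_remote and
the example.org host are placeholders, not anything Git provides:

```shell
# Hypothetical helper: given a remote's superproject URL and a submodule
# path, guess the submodule's URL on that remote. Assumes the remote is
# a working copy with the submodule checked out at its usual path.
submodule_url_for_remote() {
	super_url=${1%/}		# drop a trailing slash, if any
	super_url=${super_url%/.git}	# .../thing/.git -> .../thing
	printf '%s/%s/.git\n' "$super_url" "$2"
}

# Usage with a placeholder URL:
submodule_url_for_remote \
	"ssh://git@example.org/pub/scm/git/workspace/thing/.git" module
```

Inside a submodule, the result could then be handed to a one-off
git fetch, with the superproject URL read from remote.fred.url via
git config.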

> I'd assumed (possibly wrongly?) that there was resistance to putting
> any of the submodule logic in things other than git-submodules.

Are you following the thread about submodule support for git mv, git rm?

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce