Re: [PATCH 2/2] connect: know that zero-ID is not a ref

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Jonathan Tan wrote:

> connect.c, when processing packfiles, treats a zero ID (with
> `capabilities^{}` in place of the refname) as an actual ref instead of a
> placeholder for a capability declaration, contrary to the specification
> in Reference Discovery in Documentation/technical/pack-protocol.txt.
> This is an issue when interacting with Git implementations that follow
> this specification. For example, `ls-remote` (against a git://
> repository served by such a Git implementation) will report a ref when
> there are none, and `clone` (against something similar) will fail its
> checkout.

For example, this means I get the following confusing message when
cloning an empty repository served by "jgit daemon":

 $ git clone https://googlers.googlesource.com/jrn/empty
 Cloning into 'empty'...
 Checking connectivity... done.
 warning: remote HEAD refers to nonexistent ref, unable to checkout.

 $

It also means that standard "git daemon" is not able to advertise
capabilities on a fetch from a repository without advertising some
refs, so I cannot fetch-by-sha1 from a C git server unless some refs
are advertised.

This fix should solve the former problem and paves the way for the
latter to be solved once a year or two passes and people's clients are
up to date (or earlier if a server operator chooses).

> Make connect.c follow the specification with respect to this, while
> maintaining compatibility with existing implementations that do not
> serve the zero ID when a repository has no refs.

This description focuses on the code.  When deciding whether to apply
the patch (or whether to revert it if it comes up while trying to
investigate another problem), I am likely to wonder "what effect will
the patch have on me?" instead of "what standards does it follow?"

Flipping the emphasis, we could say something like

	connect: treat ref advertisement with capabilities^{} line as one with no refs

	When cloning an empty repository served by standard git, "git clone"
	produces the following reassuring message:

		$ git clone git://localhost/tmp/empty
		Cloning into 'empty'...
		warning: You appear to have cloned an empty repository.
		Checking connectivity... done.
	
	Meanwhile when cloning an empty repository served by JGit, the output
	is more haphazard:

		$ git clone git://localhost/tmp/empty
		Cloning into 'empty'...
		Checking connectivity... done.
		warning: remote HEAD refers to nonexistent ref, unable to checkout.

	This is a common command to run immediately after creating a remote
	repository as preparation for adding content to populate it and pushing.
	The warning is confusing and needlessly worrying.

	The cause is that, since v3.1.0.201309270735-rc1~22 (Advertise capabilities
	with no refs in upload service., 2013-08-08), JGit's ref advertisement
	includes a ref named capabilities^{} to advertise its refs on, while git's
	ref advertisement is empty in this case.  This allows the client to learn
	about the server's capabilities and is need, for example, for fetch-by-sha1
	to work when no refs are advertised.

	Git advertises the same capabilities^{} ref in its ref advertisement for
	push but since it never remembered to do so for fetch, the client forgot
	to handle this case. Handle it.

	So now you can see the same friendly message whether the server runs C
	git or JGit.

	The pack-protocol.txt documentation gives some guidance about the expected
	server behavior: what JGit does is currently required, since a list-of-refs
	is required to be non-empty.

		  advertised-refs  =  (no-refs / list-of-refs)
				      *shallow
				      flush-pkt

		  no-refs          =  PKT-LINE(zero-id SP "capabilities^{}"
				      NUL capability-list)

		  list-of-refs     =  first-ref *other-ref

	Because git client versions without this fix are expected to exist in the
	wild for a while, we should not change the server to always send the
	capabilities^{} line when there are no refs to advertise yet.  A transition
	will take multiple steps:

	 1. This patch, which updates the client

	 2. Update pack-protocol to clarify that both server behaviors must be
	    tolerated.

	 3. Add a configuration variable to allow git upload-pack to advertise
	    capabilities when there are no refs to advertise.  Leave it disabled
	    by default since git clients can't be counted on to have this patch (1)
	    yet.

	 4. After a year or so, flip the default for that server configuration
	    variable to true.

[...]
> --- a/connect.c
> +++ b/connect.c
> @@ -165,6 +165,13 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  			continue;
>  		}
>  
> +		if (is_null_oid(&old_oid)) {
> +			if (strcmp(name, "capabilities^{}"))
> +				warning("zero object ID received that is not accompanied by a "
> +				        "capability declaration, ignoring and continuing anyway");
> +			continue;
> +		}

I'd rather this be stricter.  Though it's not your fault: the code
before it is very lax in letting each line specify capabilities again.

Could this behave more like the ".have" case?  That is, just

		if (!strcmp(name, "capabilities^{}"))
			continue;

Or if we want to sanity-check the line further,

		if (!strcmp(name, "capabilities^{}")) {
			if (len == name_len + GIT_SHA1_HEX_SZ + 1)
				die("protocol error: expected capabilities after nul, got none");
			if (!is_null_oid(&old_oid))
				die("protocol error: expected zero-id, got %s", oid_to_hex(&old_oid));
			continue;
		}

The protocol definition and other clients already handle capabilities^{} differently
from other cases of zero-id.

[...]
> --- a/t/t5512-ls-remote.sh
> +++ b/t/t5512-ls-remote.sh
> @@ -207,5 +207,27 @@ test_expect_success 'ls-remote --symref omits filtered-out matches' '
[...]
> +test_expect_success JGIT,GIT_DAEMON 'indicate no refs in standards-compliant empty remote' '
> +	JGIT_DAEMON_PID= &&
> +	git init --bare empty.git &&
> +	touch empty.git/git-daemon-export-ok &&
> +	{
> +		jgit daemon --port="$JGIT_DAEMON_PORT" . &
> +		JGIT_DAEMON_PID=$!
> +	} &&
> +	test_when_finished kill "$JGIT_DAEMON_PID" &&
> +	sleep 1 && # allow jgit daemon some time to set up

Can this sleep be avoided?  E.g. start_git_daemon in lib-git-daemon.sh
uses output from the daemon to tell when it is serving.

Thanks for writing this.  I'll be happy when this protocol blip is gone.

Sincerely,
Jonathan



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]