Hi, Jonathan Tan wrote: > connect.c, when processing packfiles, treats a zero ID (with > `capabilities^{}` in place of the refname) as an actual ref instead of a > placeholder for a capability declaration, contrary to the specification > in Reference Discovery in Documentation/technical/pack-protocol.txt. > This is an issue when interacting with Git implementations that follow > this specification. For example, `ls-remote` (against a git:// > repository served by such a Git implementation) will report a ref when > there are none, and `clone` (against something similar) will fail its > checkout. For example, this means I get the following confusing message when cloning an empty repository served by "jgit daemon": $ git clone https://googlers.googlesource.com/jrn/empty Cloning into 'empty'... Checking connectivity... done. warning: remote HEAD refers to nonexistent ref, unable to checkout. $ It also means that standard "git daemon" is not able to advertise capabilities on a fetch from a repository without advertising some refs, so I cannot fetch-by-sha1 from a C git server unless some refs are advertised. This fix should solve the former problem and paves the way for the latter to be solved once a year or two passes and people's clients are up to date (or earlier if a server operator chooses). > Make connect.c follow the specification with respect to this, while > maintaining compatibility with existing implementations that do not > serve the zero ID when a repository has no refs. This description focuses on the code. When deciding whether to apply the patch (or whether to revert it if it comes up while trying to investigate another problem), I am likely to wonder "what effect will the patch have on me?" instead of "what standards does it follow?" Flipping the emphasis, we could say something like connect: treat ref advertisement with capabilities^{} line as one with no refs When cloning an empty repository served by standard git, "git clone" produces the following reassuring message: $ git clone git://localhost/tmp/empty Cloning into 'empty'... warning: You appear to have cloned an empty repository. Checking connectivity... done. Meanwhile when cloning an empty repository served by JGit, the output is more haphazard: $ git clone git://localhost/tmp/empty Cloning into 'empty'... Checking connectivity... done. warning: remote HEAD refers to nonexistent ref, unable to checkout. This is a common command to run immediately after creating a remote repository as preparation for adding content to populate it and pushing. The warning is confusing and needlessly worrying. The cause is that, since v3.1.0.201309270735-rc1~22 (Advertise capabilities with no refs in upload service., 2013-08-08), JGit's ref advertisement includes a ref named capabilities^{} to advertise its refs on, while git's ref advertisement is empty in this case. This allows the client to learn about the server's capabilities and is need, for example, for fetch-by-sha1 to work when no refs are advertised. Git advertises the same capabilities^{} ref in its ref advertisement for push but since it never remembered to do so for fetch, the client forgot to handle this case. Handle it. So now you can see the same friendly message whether the server runs C git or JGit. The pack-protocol.txt documentation gives some guidance about the expected server behavior: what JGit does is currently required, since a list-of-refs is required to be non-empty. advertised-refs = (no-refs / list-of-refs) *shallow flush-pkt no-refs = PKT-LINE(zero-id SP "capabilities^{}" NUL capability-list) list-of-refs = first-ref *other-ref Because git client versions without this fix are expected to exist in the wild for a while, we should not change the server to always send the capabilities^{} line when there are no refs to advertise yet. A transition will take multiple steps: 1. This patch, which updates the client 2. Update pack-protocol to clarify that both server behaviors must be tolerated. 3. Add a configuration variable to allow git upload-pack to advertise capabilities when there are no refs to advertise. Leave it disabled by default since git clients can't be counted on to have this patch (1) yet. 4. After a year or so, flip the default for that server configuration variable to true. [...] > --- a/connect.c > +++ b/connect.c > @@ -165,6 +165,13 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len, > continue; > } > > + if (is_null_oid(&old_oid)) { > + if (strcmp(name, "capabilities^{}")) > + warning("zero object ID received that is not accompanied by a " > + "capability declaration, ignoring and continuing anyway"); > + continue; > + } I'd rather this be stricter. Though it's not your fault: the code before it is very lax in letting each line specify capabilities again. Could this behave more like the ".have" case? That is, just if (!strcmp(name, "capabilities^{}")) continue; Or if we want to sanity-check the line further, if (!strcmp(name, "capabilities^{}")) { if (len == name_len + GIT_SHA1_HEX_SZ + 1) die("protocol error: expected capabilities after nul, got none"); if (!is_null_oid(&old_oid)) die("protocol error: expected zero-id, got %s", oid_to_hex(&old_oid)); continue; } The protocol definition and other clients already handle capabilities^{} differently from other cases of zero-id. [...] > --- a/t/t5512-ls-remote.sh > +++ b/t/t5512-ls-remote.sh > @@ -207,5 +207,27 @@ test_expect_success 'ls-remote --symref omits filtered-out matches' ' [...] > +test_expect_success JGIT,GIT_DAEMON 'indicate no refs in standards-compliant empty remote' ' > + JGIT_DAEMON_PID= && > + git init --bare empty.git && > + touch empty.git/git-daemon-export-ok && > + { > + jgit daemon --port="$JGIT_DAEMON_PORT" . & > + JGIT_DAEMON_PID=$! > + } && > + test_when_finished kill "$JGIT_DAEMON_PID" && > + sleep 1 && # allow jgit daemon some time to set up Can this sleep be avoided? E.g. start_git_daemon in lib-git-daemon.sh uses output from the daemon to tell when it is serving. Thanks for writing this. I'll be happy when this protocol blip is gone. Sincerely, Jonathan