Re: [RFC/WIP PATCH 11/11] Document protocol version 2

Jeff King <peff@xxxxxxxx> · Fri, 29 May 2015 18:21:20 -0400

On Fri, May 29, 2015 at 02:52:14PM -0700, Junio C Hamano wrote:

> > Currently we can do a = as part of the line after the first ref, such as
> >
> >     symref=HEAD:refs/heads/master agent=git/2:2.4.0
> >
> > so I thought we want to keep this.
> 
> I do not understand that statement.
> 
> Capability exchange in v2 is one packet per cap, so the above
> example would be expressed as:
> 
>         symref=HEAD:refs/heads/master
>         agent=git/2:2.4.0
> 
> right?  Your "keyvaluepair" is limited to [a-z0-9-_=]*, and neither
> of the above two can be expressed with that, which was why I said
> you need two different sets of characters before and after "=".  Left
> hand side of "=" is tightly limited and that is OK.  Right hand side
> may contain characters like ':', '.' and '/', so your alphabet needs
> to be more lenient, even in v1 (which I would imagine would be "any
> octet other than SP, LF and NUL").

Yes. See git_user_agent_sanitized(), for example, which allows basically
any printable ASCII except for SP.
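
To make that restriction concrete, here is a rough sketch. It is not the
code from git_user_agent_sanitized(); sanitize_cap_value() is a made-up
name, and replacing offending bytes with '.' is just one plausible
policy that I am assuming for illustration.

```c
#include <stddef.h>

/*
 * Illustrative only: overwrite every byte that is not printable,
 * non-space ASCII with '.', so the value cannot break v1's SP- and
 * NUL-delimited capability string. The function name is hypothetical.
 */
static void sanitize_cap_value(char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)buf[i];

		/* 0..32 covers NUL, LF and SP; 127 and up is DEL or non-ASCII */
		if (c <= 32 || c >= 127)
			buf[i] = '.';
	}
}
```

The value is rewritten in place, so the caller keeps ownership of the
buffer and nothing is allocated.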

I think the v2 capabilities do not even need to have that restriction.
They can allow arbitrary binary data, because v2 has an 8bit-clean
framing mechanism (pkt-lines). Of course, that means such capabilities cannot be
represented in a v1 conversation (whose framing mechanism involves SP
and NUL). But it's probably acceptable to introduce new capabilities
which are only available in a v2 conversation. Old clients that do not
understand v2 would not understand the capability either. It does
require new clients implementing the capability to _also_ implement v2
if they have not done so, but I do not mind pushing people in that
direction.
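
As a sketch of why pkt-line framing is 8bit-clean: each packet carries
its total length as four hex digits, so the payload never has to be
scanned for a delimiter. The helpers below are hypothetical names, not
git's pkt-line API, and SKETCH_PKT_MAX is simply the largest value four
hex digits can hold (the limit git actually enforces is a bit lower).
Flush packets ("0000") are deliberately not handled.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: largest total length that fits in four hex digits. */
#define SKETCH_PKT_MAX 0xffff

/* Write <4 hex digits of total length><payload>; return bytes written, 0 on error. */
static size_t pkt_encode(unsigned char *out, size_t outsz,
			 const void *data, size_t len)
{
	static const char hex[] = "0123456789abcdef";
	size_t total = len + 4;	/* the length field counts itself */

	if (total > SKETCH_PKT_MAX || total > outsz)
		return 0;
	out[0] = hex[(total >> 12) & 0xf];
	out[1] = hex[(total >> 8) & 0xf];
	out[2] = hex[(total >> 4) & 0xf];
	out[3] = hex[total & 0xf];
	memcpy(out + 4, data, len);
	return total;
}

static int hexval(unsigned char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Point *data / *len at the payload inside "in" without copying.
 * Return the bytes consumed, or 0 for a short, malformed or flush packet.
 */
static size_t pkt_decode(const unsigned char *in, size_t insz,
			 const unsigned char **data, size_t *len)
{
	size_t total = 0;
	int i;

	if (insz < 4)
		return 0;
	for (i = 0; i < 4; i++) {
		int v = hexval(in[i]);

		if (v < 0)
			return 0;
		total = (total << 4) | (size_t)v;
	}
	if (total < 4 || total > insz)
		return 0;
	*data = in + 4;
	*len = total - 4;
	return total;
}
```

A payload such as "x=a b" followed by a NUL and more bytes round-trips
unchanged, which is exactly what a v1 capability string cannot do.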

The initial v2 client implementation should probably do a few cautionary
things, then:

  1. Do _not_ fold the per-pkt capabilities into a v1 string; that loses
     the robust framing. I suggested string_list earlier, but probably
     we want a list of ptr/len pairs, so that it can remain NUL-clean.

  2. Avoid holding on to unknown packets longer than necessary. Some
     capability pkt-lines may be arbitrarily large (up to 64K). If we do
     not understand them during the v2 read of the capabilities, there
     is no point hanging on to them. It's not _wrong_ to do so, but just
     inefficient; if we know that clients will just throw away unknown
     packets, then we can later introduce new packets with large data,
     without worrying about wasting the client's resources.

     I suspect it's not that big a deal either way, though. I have no
     plans for sending a bunch of large packets, and anyway network
     bandwidth is probably more precious than client memory.
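
Roughly, the two points above could look like the sketch below. All of
the names (struct cap, cap_list_add_if_known(), the table of known
keys) are hypothetical and only meant to show the shape: each kept
capability is stored as a ptr/len pair, and anything we do not
recognize is dropped as soon as it is read.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One capability packet, kept NUL-clean as a ptr/len pair. */
struct cap {
	unsigned char *data;
	size_t len;
};

struct cap_list {
	struct cap *items;
	size_t nr, alloc;
};

/* Assumption: the client knows its capabilities by the key before '='. */
static int cap_is_known(const unsigned char *data, size_t len)
{
	static const char *known[] = { "symref", "agent" };
	const unsigned char *eq = memchr(data, '=', len);
	size_t klen = eq ? (size_t)(eq - data) : len;
	size_t i;

	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++)
		if (strlen(known[i]) == klen && !memcmp(data, known[i], klen))
			return 1;
	return 0;
}

/* Return 1 if stored, 0 if dropped as unknown, -1 on allocation failure. */
static int cap_list_add_if_known(struct cap_list *list,
				 const unsigned char *data, size_t len)
{
	unsigned char *copy;

	if (!cap_is_known(data, len))
		return 0;	/* do not hold on to packets we cannot use */
	if (list->nr == list->alloc) {
		size_t alloc = list->alloc ? list->alloc * 2 : 4;
		struct cap *items = realloc(list->items, alloc * sizeof(*items));

		if (!items)
			return -1;
		list->items = items;
		list->alloc = alloc;
	}
	copy = malloc(len ? len : 1);
	if (!copy)
		return -1;
	memcpy(copy, data, len);
	list->items[list->nr].data = copy;
	list->items[list->nr].len = len;
	list->nr++;
	return 1;
}

static void cap_list_clear(struct cap_list *list)
{
	size_t i;

	for (i = 0; i < list->nr; i++)
		free(list->items[i].data);
	free(list->items);
	list->items = NULL;
	list->nr = list->alloc = 0;
}
```

Because unknown packets never enter the list, the memory held is
bounded by the capabilities the client actually understands, no matter
how large the ones it skips are.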

-Peff