On Wed, Apr 24 2019, Jonathan Nieder wrote: > Ævar Arnfjörð Bjarmason wrote: >> On Wed, Apr 24 2019, Jonathan Nieder wrote: >>> Jeff King wrote: >>>> On Fri, Mar 08, 2019 at 01:55:17PM -0800, Jonathan Tan wrote: > >>>>> +If the 'packfile-uris' feature is advertised, the following argument >>>>> +can be included in the client's request as well as the potential >>>>> +addition of the 'packfile-uris' section in the server's response as >>>>> +explained below. >>>>> + >>>>> + packfile-uris <comma-separated list of protocols> >>>>> + Indicates to the server that the client is willing to receive >>>>> + URIs of any of the given protocols in place of objects in the >>>>> + sent packfile. Before performing the connectivity check, the >>>>> + client should download from all given URIs. Currently, the >>>>> + protocols supported are "http" and "https". >>>> >>>> This negotiation seems backwards to me, because it puts too much power >>>> in the hands of the server. >>> >>> Thanks. Forgive me if this was covered earlier in the conversation, but >>> why do we need more than one protocol at all here? Can we restrict this >>> to only-https, all the time? >> >> There was this in an earlier discussion about this: >> https://public-inbox.org/git/877eds5fpl.fsf@xxxxxxxxxxxxxxxxxxx/ >> >> It seems arbitrary to break it for new features if we support http in >> general, especially with a design as it is now where the checksum of the >> pack is transmitted out-of-band. > > Thanks for the pointer. TLS provides privacy, too, but I can see why > in today's world it might not always be easy to set it up, and given > that we have integrity protection via that checksum, I can see why > some people might have a legitimate need for using plain "http" here. > > We may also want to support packfile-uris using SSH protocol in the > future. Might as well figure out how the protocol negotiation works > now. So let's delve more into it: > > Peff mentioned that it feels backwards for the client to specify what > protocols they support in the request, instead of the server > specifying them upfront in the capability advertisement. I'm inclined > to agree: it's probably reasonable to put this in server capabilities > instead. That would even allow the client to do something like > > This server only supports HTTP without TLS, which you have > indicated is a condition in which you want to be prompted. > Proceed? > > [Use HTTP packfiles] [Use slower but safer inline packs] > > Peff also asked whether protocol scheme is the right granularity: > should the server list what domains they can serve packfiles from > instead? In other words, once you're doing it for protocol schemes, > why not do it for whole URIs too? I'm grateful for the question since > it's a way to probe at design assumptions. > > - protocol schemes are likely to be low in number because each has its > own code path to handle it. By comparison, domains or URIs may be > too numerous to be something we want to jam into the capability > advertisement. (Or the server operator could always use the same > domain as the Git repo, and then use a 302 to redirect to the CDN. > I suspect this is likely to be a common setup anyway: it allows the > Git server to generate a short-lived signed URL that it uses as the > target of a 302. But in this case, what is the point of a domain > whitelist?) > > - relatedly, because the list of protocol schemes is small, it is > feasible to test client behavior with each subset of protocol > schemes enabled. Finer-grained filtering would mean more esoteric > client configurations for server operators to support and debug. > > - supported protocol schemes do not vary per request. The actual > packfile URI is dynamic and varies per request > > - separately from questions of preference or security policy, > clients may have support for a limited subset of protocol schemes. > For example, imagine a stripped-down client without SSH support. > So we need a way to agree about this capability anyway. > > So I suspect that, at least to start, protocol scheme negotiation > should be enough and we don't need full URI negotiation. > > There are a few escape valves: > > - affected clients can complain to the server operator, who will then > reconfigure the server to use more appropriate packfile URIs > > - if there is a need for different clients to use different packfile > URIs, clients can pass a flag, using --server-option, to the server > to help it choose. > > - a client can disable support for packfile URIs on a particular > request and fall back to inline packs. > > - if and when an affected client materializes, they can help us > improve the protocol to handle their needs. > > Sensible? Food for thought: would we consider ssh->https a "downgrade"? I think "maybe". We're going from whatever custom setting the user has (e.g. manually approve new hosts) to the CA system. But I think it would be fine to just only whitelist ssh->https and ban everything else behind a very scary config option or something, we could always fleshen out the semantics of upgrade/downgrade/switching later, and it would IMO suck less than outright banning a protcol we otherwise support in the design, and which (unlike git://) is something people are still finding uses for in the wild for non-legacy reasons.