Re: Comments pack protocol description in "Git Community Book" (second round)

On Sun, 7 June 2009, Shawn O. Pearce wrote:
> Jakub Narebski <jnareb@xxxxxxxxx> wrote:

> > > ### Fetching Data with Upload Pack ###
> ...
> > Although fetching via SSH protocol is, I guess, much more rare than
> > fetching via anonymous unauthenticated git:// protocol,
> 
> Actually, fetching via SSH might be quite common, think about all of
> those companies using Git internally... they are running something
> like Gitosis or Gerrit Code Review, both of which support SSH only
> access to the hosted repositories.

I am blind... I forgot that Git is used not only for F/OSS software,
but also for developing proprietary code on company intranets, where
only a limited number of people should have read access to the
repository.

> 
> > it _might_ be
> > a good idea to say there that fetching via SSH differs from the above
> > sequence: instead of opening a TCP connection to port 9418,
> > sending the above packet, and later reading from and writing to the
> > socket, "git clone ssh://myserver.com/srv/git/project.git" calls
> > 
> > 	ssh myserver.com git-upload-pack /srv/git/project.git
> > 
> > and then reads from the standard output of that command and writes
> > to its standard input.
> 
> Yes, this should be mentioned.  We actually should document in
> the protocol specification how we fork SSH, and what the SSH server
> should then be presenting as the command line.
> 
> I've run into problems with hosting sites like GitHub and Gitorious
> not correctly honoring some ssh invocations, because they use the forced
> command execution model and were handling only one case that could
> be presented to them.

Can you offer some details?  Or is it out of scope for the git pack
protocol description, and more about correctly implementing the SSH
protocol and remote command invocation in it?

> 
> > The rest of exchange is _identical_ for git:// and for ssh:// (and
> > I guess also for file:// pseudoprotocol).
> 
> Yes, the file:// pseudoprotocol works by forking a child to run the
> `git-upload-pack /srv/git/project.git` executable and runs a pair
> of pipes between them, just like ssh:// does when it spawns off
> the ssh client process.

That would be nice information to have for people (re)implementing Git,
I think.
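
A rough sketch of how a reimplementation might pick the command to
spawn for each transport.  The function name upload_pack_command is
made up for illustration, and the quoting is only an assumption about
what a remote shell needs; real Git handles many more URL forms and
may quote the path differently.

```python
import shlex
from urllib.parse import urlsplit


def upload_pack_command(url):
    """Return the argv a client might spawn to talk to git-upload-pack.

    Hypothetical helper, not Git's actual code.
    """
    parts = urlsplit(url)
    if parts.scheme == "ssh":
        # ssh hands the remote command to a shell as a single string,
        # so the repository path is quoted defensively.
        remote = "git-upload-pack " + shlex.quote(parts.path)
        return ["ssh", parts.hostname, remote]
    if parts.scheme == "file":
        # Local case: run the same program directly, connected by pipes.
        return ["git-upload-pack", parts.path]
    raise ValueError("unsupported scheme: " + parts.scheme)
```

The resulting argv could then be handed to subprocess.Popen with
stdin=subprocess.PIPE and stdout=subprocess.PIPE; from that point the
client writes requests to the child's standard input and reads
responses from its standard output, the same way for both transports.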

Sidenote: it will be the same for the planned "smart" HTTP protocol,
except that HTTP is stateless, so some kind of state information would
additionally have to be passed.

> > Footnote: [pkt-line format] somewhat resembles the 'chunked' transfer
> > encoding used in HTTP[1], although there are differences.
> >   http://en.wikipedia.org/wiki/Chunked_transfer_encoding
> 
> This is not worth mentioning.  pkt-line is different enough that
> it may just confuse the reader.

O.K. 

I mentioned it because it also uses hexadecimal for length.
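
For reference, a minimal sketch of the length prefix as I understand
it: four hexadecimal digits giving the total length of the line,
including the four digits themselves, with "0000" as the flush packet.
The names pkt_line and read_pkt_lines are mine, and the size check
only enforces what four hex digits can express; real implementations
use a lower limit.

```python
FLUSH_PKT = b"0000"


def pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its pkt-line length (4 hex digits)."""
    length = len(payload) + 4  # the length counts its own 4 bytes
    if length > 0xFFFF:
        raise ValueError("payload too long for one pkt-line")
    return b"%04x" % length + payload


def read_pkt_lines(data: bytes):
    """Yield each payload in data; yield None for a flush packet."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            yield None  # flush packet carries no payload
            pos += 4
            continue
        if length < 4:
            raise ValueError("invalid pkt-line length")
        yield data[pos + 4:pos + length]
        pos += length
```

For example, pkt_line(b"hello\n") gives b"000ahello\n": six bytes of
payload plus the four length digits is 10, i.e. 000a.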

>  
> > Below is (for completeness) a list of git-upload-pack
> > capabilities, with a short description of each:
> > 
> >  * multi_ack (for historical reasons not multi-ack)
> ...
> >    See the thread for more details (posts by Shawn O. Pearce and by
> >    Junio C Hamano).
> 
> This really needs a diagram example, like the one I drew, to
> explain the concept.  It's really hard to grasp from just reading
> that paragraph what that implies, especially if you are implementing
> a client or a server.

While I don't think one would have to describe the Git object model
and the Git repository storage model (the storage model, i.e. the
loose object format, the packfile and packfile index formats, and
everything else in .git, should in my opinion be described in a
separate RFC-like document), it would be helpful to describe the
"history DAG" model Git uses, and a bit about revision walking.  What
use is describing the git pack protocol if the idea behind it, namely
coming up with an optimal packfile to send, is not understood?
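
To illustrate the kind of reasoning I mean, here is a deliberately
simplified sketch: treat history as a DAG of commits, and send what is
reachable from what the client wants but not from what it already has.
The names (parents, wants, haves, commits_to_send) are invented for
the example, and it ignores trees, blobs, and the incremental
negotiation that the real protocol performs.

```python
def reachable(parents, tips):
    """Return every commit reachable from tips by following parents."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def commits_to_send(parents, wants, haves):
    """Commits reachable from wants that are not reachable from haves."""
    return reachable(parents, wants) - reachable(parents, haves)


# Toy history: A <- B, B <- C, B <- D, and E merges C and D.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["C", "D"]}
missing = commits_to_send(history, wants=["E"], haves=["C"])
# missing is {"D", "E"}: A, B and C are already on the client side.
```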

> >  * no-progress
> > 
> >    Client should use it if it was started with "git clone -q" or
> >    something, and doesn't want that side brand 2.  We still want
> 
> typo, should be "... side band 2." :-)
> 
> >    sideband 1 with actual data (packfile), and sideband 3 with error
> >    messages.
> 
> Also, this capability really only makes sense if side-band or
> side-band-64k was requested.  IOW, a sane client wouldn't ask
> for this if it doesn't support side-band.

Right. "no-progress" makes sense only in the context of sideband,
currently "side-band" and "side-band-64k". For the server it means
that it MUST send (currently) only streams 1 (data) and 3 (fatal
error); conversely it MUST NOT send stream 2 (progress).
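
A small sketch of what that means on the receiving side, assuming each
sideband payload (the pkt-line contents, already stripped of the
length prefix) starts with one byte naming the stream.  The function
demux_sideband and its no_progress flag are hypothetical names, and
treating an unexpected stream 2 as an error is my own choice for the
example, not something the protocol description mandates for clients.

```python
def demux_sideband(payloads, no_progress=False):
    """Split sideband payloads into pack data and progress messages."""
    data = bytearray()
    progress = []
    for payload in payloads:
        band, body = payload[0], payload[1:]
        if band == 1:
            data += body  # stream 1: packfile data
        elif band == 2:
            if no_progress:
                # The server promised not to send stream 2.
                raise ValueError("progress sent despite no-progress")
            progress.append(body)  # stream 2: progress messages
        elif band == 3:
            # stream 3: fatal error reported by the remote side
            raise RuntimeError("remote error: " +
                               body.decode(errors="replace"))
        else:
            raise ValueError("unknown sideband stream %d" % band)
    return bytes(data), progress
```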

-- 
Jakub Narebski
Poland