Re: git over webdav: what can I do for improving http-push ?

On Fri, Jan 04, 2008 at 12:54:58 +1300, Martin Langhoff wrote:
> On Jan 4, 2008 10:15 AM, Jan Hudec <bulb@xxxxxx> wrote:
> > Now to keep it stateless, I thought that:
> ...
> > This would guarantee, that when you want n revisions, you make at most
> > log2(n) requests and get at most 2*n revisions (well, the requests are for
> 
> That is still a lot! How about, for each ref

The whole point of that is that the packs can be statically precomputed and
served with quite low CPU load, which is useful when serving from shared
computers (like servers in school computer labs or cheap web hosting) or from
slow servers like the NSLU2. It also makes HTTP caching actually useful,
because the set of possible requests is quite limited.

Also, while I said it's per ref, the packs should really be optimized for the
common case of fetching all refs, which would make it just log2(n) packs and
2*n revisions for each complete download.
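To make the log2(n)/2*n claim concrete, here is a minimal sketch of one way "statically precomputed" packs could be laid out (a hypothetical scheme for illustration, not anything git actually implements): packs cover geometrically growing ranges of history, so a client that is n commits behind fetches at most about log2(n) packs containing at most about 2*n commits in total.

```python
def pack_ranges(total_commits):
    """Ranges [start, end) of precomputed packs, newest first:
    the newest 1 commit, the 2 before that, then 4, 8, ...
    (hypothetical layout, for illustration only)."""
    ranges = []
    end = total_commits
    size = 1
    while end > 0:
        start = max(0, end - size)
        ranges.append((start, end))
        end = start
        size *= 2
    return ranges

def packs_needed(total_commits, client_has_up_to):
    """Packs a client must fetch to cover commits it is missing,
    i.e. the range [client_has_up_to, total_commits)."""
    return [r for r in pack_ranges(total_commits) if r[1] > client_has_up_to]
```

For a history of 100 commits this yields 7 packs, and a client 10 commits behind fetches 4 of them, downloading 15 commits for the 10 it needs -- within the 2*n bound.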

>  - Client sends a POST listing the ref and the latest related commit
> it has that the server is likely to have (from origin/heads/<ref>).
> Optionally, it can provide a blacklist of <treeish> (where every
> object referred to is known) and blob sha1s.
>  - Server sends the new sha1 of the ref, and a thin pack that covers the changes
>  - The client can disconnect to stop the transaction. For example --
> if it sees the sha1 of a huge object that it already has. It can
> re-request, with a blacklist.
> 
> A good number of objects will be sent unnecessarily - with no option for
> the client to say "I have this" - but by using the hint of letting the
> server know we have origin/heads/<ref> I suspect that it will be
> minimal.

It would be better to unnecessarily send only rev-lists. Since each HTTP
request will likely have something like 1 kB of overhead, sending a few kB
worth of rev-list is still pretty efficient. So send part of the rev-list,
then more of it, and so on until you find exactly which revisions you need,
and then ask for them. That saves *both* bandwidth *and* server CPU. The only
reason to waste bandwidth would be to save CPU, and this approach doesn't do
that.
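The chunked rev-list negotiation above could look roughly like this (a sketch under assumed names, not git's actual wire protocol): the client pulls the server's revision list in chunks, each costing roughly one request's worth of overhead, until it sees a commit it already has, then requests only the revisions it is actually missing.

```python
def negotiate(server_revlist, client_has, chunk=16):
    """Sketch of chunked rev-list negotiation.
    server_revlist: commit ids on the server, newest first
    (in practice each chunk would be a separate HTTP request).
    client_has: set of commit ids the client already has.
    Returns the commits the client must actually request."""
    missing = []
    for i in range(0, len(server_revlist), chunk):
        for rev in server_revlist[i:i + chunk]:
            if rev in client_has:
                return missing          # found a common commit; stop early
            missing.append(rev)
    return missing                      # no common history at all
```

The point is that the worst case transfers a few kB of rev-list rather than megabytes of objects the client already has.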

> Also:
>  - It will probably be useful to list all the refs the client knows
> from that server in the request.
>  - If the ref has changed with a non-fast-forward, the server needs to
> say so, and provide a listing of the commits. As soon as the client
> spots a common commit, it can close the connection -- it now knows
> what ref to tell the server about in a subsequent command.
> 
> This way, you ideally have 1 request per ref, 2 if it has been
> rebased/rewound. This can probably get reorganised to do several refs
> in one request.
> 
> cheers,
> 
> 
> m
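For what it's worth, the two-request flow you outline for a rewound ref could be sketched like this (all names hypothetical, not an actual git API): the first request reports a non-fast-forward and carries the server's commit listing; the client scans it for the newest commit it shares, then repeats the request with that commit as its base.

```python
from dataclasses import dataclass

@dataclass
class Reply:
    status: str     # "ok" or "non-fast-forward"
    listing: list   # server's commits, newest first (on non-fast-forward)
    pack: bytes

def find_new_base(server_listing, client_commits):
    """Newest server commit the client already has, or None."""
    for commit in server_listing:
        if commit in client_commits:
            return commit
    return None

def fetch_ref(request, ref, base, client_commits):
    """request(ref, base) -> Reply is a hypothetical transport callable."""
    reply = request(ref, base)
    if reply.status == "non-fast-forward":
        base = find_new_base(reply.listing, client_commits)
        reply = request(ref, base)      # second, final request
    return reply.pack
```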
-- 
						 Jan 'Bulb' Hudec <bulb@xxxxxx>
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
