Re: [RFC PATCH v4 14/26] Add stateless RPC options to upload-pack, receive-pack

"Shawn O. Pearce" <spearce@xxxxxxxxxxx> · Thu, 29 Oct 2009 08:26:29 -0700

Junio C Hamano <gitster@xxxxxxxxx> wrote:
> "Shawn O. Pearce" <spearce@xxxxxxxxxxx> writes:
> 
> > When --stateless-rpc is passed [...]
> > When --advertise-refs is passed [...]
> 
> Is the idea to first run with --advertise-refs to get the set of refs, and
> then doing another, separate run with --stateless-rpc, assuming that the
> refs the other end advertised does not change in the meantime, to conclude
> the business?

Yes.

> I suspect that two separate invocations on a (supposedly) single
> repository made back-to-back can produce an inconsistent response in
> verious situations (e.g. somebody pushing in the middle, or the same
> hostname served by more than one mirrors) and the other end can get
> confused.

Yes, that can happen.

> I do not think there is any way to avoid it, short of adding to the second
> request some "cookie" that allows the second requestee to verify that the
> request is based on what he would give to the requester if this request
> were the first step of the request made to him, iow, his state did not
> change in the middle since the first request was made (either to him or to
> the equivalent mirror sitting next to himm).

There isn't a lot we can do, you are right.

One approach is to use an HMAC and sign each advertised SHA-1
during the initial --advertise-refs phase.  Requesters then present
the SHA-1 and the HAMC signature in each "want" line, and the
--stateless-rpc phase validates the signatures to ensure they came
from a trusted party.

The major problem with this approach is the private key management.
All mirrors of that repository need to have a common private key
so they can generate and later verify that HMAC signature.  This is
additional complexity, for perhaps not much gain.

A different approach is to have the --stateless-rpc phase validate
the want lines against its refs (like we do now), but if the validate
fails (no ref exactly matches) support walking backwards through the
last few records in the reflog, or through that ref's own history
for a few hundred commits, to see if the want SHA-1 appears in the
recent prior past.

Obviously when walking the reflog we would need to use a time bound,
e.g. "only examine last record if more recent than 5 minutes ago".
This way a force push to erase something will still erase it, but
a client who had just recently observed the prior SHA-1 can still
complete their fetch request, just as with native git:// they could
do that due to the prior SHA-1 being held in the git-upload-pack's
private memory.

Also, obviously when walking the commit history of a ref (to see
if the want'd SHA-1 is merged into one or more reachable refs)
we don't want to walk backwards too far, as it costs CPU time on
the server side, but we also don't want to walk too few commits.
E.g. pushes for me tend to be in the 20 commit range, while Linus
probably pushes a thousand commits or more in a single merge.
Finding the right balance may be tricky.

> I wouldn't worry too much about this if the only negative effect could be
> that the requestor's second request may result in an error, but I am
> wondering if this can be used to attack the requestee.

I don't think it can be used to attack the server.  The only impact
I can see is the client gets confused and gets an error response
from the server when the server aborts due to the invalid "want"
line sent during that 2nd (or any subsequent) request.

-- 
Shawn.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html