Re: [PATCH 0/3] fix unparsed object access in upload-pack

Jeff King <peff@xxxxxxxx> · Sun, 17 Mar 2013 04:47:51 -0400

On Sat, Mar 16, 2013 at 11:17:18PM -0700, Junio C Hamano wrote:

> > I almost wonder if we should cut it out entirely. It is definitely a
> > possible race condition, but I wonder if anybody actually hits it in
> > practice (and if they do, the consequence is that the fetch fails and
> > needs to be retried). As far as I can tell, the code path has never
> > actually been followed, and I do not recall ever seeing a bug report or
> > complaint about it (though perhaps it happened once, which spurred the
> > initial development?).
> 
> If you run multiple servers serving the same repository at the same
> URL with a small mirroring lag, one may observe a set of refs from
> one server, that are a tad older than the other server you actually
> fetch from.  k.org may have such an arrangement, but does GitHub
> serve the same repository on multiple machines without tying the
> same client to the same backend?

Each repository is on a single backend host. They're redundant
internally (each host is actually multiple hosts), but pure-git requests
go to a single master for each host (though for some read-only
operations I think we spread the load across the redundant spares). You
might get a separate machine during a failover event, but they share
block devices via DRBD, so in theory an fsync() should hit both
machines, and there is no lag (and you are likely to get an intermittent
failure in such a case, anyway, since the machine serving your git
request probably died mid-packet).

I thought this change was to protect against the common race:

  1. Client requests stateless ref advertisement.

  2. Somebody updates ref.

  3. Client requests "want" objects based on old advertisement.

and I think it does solve that (assuming step 2 is not a rewind). The
important thing is that time always moves forward.
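
To make the "not a rewind" caveat concrete, here is a toy sketch. It
assumes the server accepts a "want" as long as it is still reachable
from the current ref tip, which is my reading of the patch and not a
copy of the upload-pack code. The commit names and the helpers
reachable() and want_ok() are made up for illustration.

```python
# Toy commit graph: each commit maps to its list of parents.
# c1 <- c2 <- c3 is a fast-forward chain; x1 is a rewrite on top of c1.
PARENTS = {"c1": [], "c2": ["c1"], "c3": ["c2"], "x1": ["c1"]}


def reachable(tip, parents=PARENTS):
    """Return the set of commits reachable from tip, including tip."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def want_ok(want, current_tip):
    """Hypothetical check: accept a want reachable from the current tip."""
    return want in reachable(current_tip)


advertised = "c2"                          # step 1: client sees the old tip
fast_forward = want_ok(advertised, "c3")   # step 2 moved the ref forward
rewind = want_ok(advertised, "x1")         # step 2 rewound and rewrote
print(fast_forward, rewind)
```

In the fast-forward case the old tip is still an ancestor of the new
one, so the want can be served; after a rewind it is not, and the
request has to fail.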

But if you are talking about mirror lag, time can move in either
direction. Imagine you have two machines, A and B, and A is missing an
update that B already has. If you hit A first, then B, it is the same
as the update sequence above, and the patch helps. But if you hit B
first, then A, you will ask A for objects it has not yet received, and
we must fail.
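
The asymmetry can be sketched with two toy mirrors. This is only a
model under stated assumptions: history is linear, a mirror holds
every commit up to its own tip, and the names make_mirror() and
fetch() are invented for the example.

```python
# Linear history, oldest first. Commit names are made up.
HISTORY = ["c1", "c2", "c3"]


def make_mirror(tip):
    """A mirror holds every commit up to and including its tip."""
    return {"tip": tip, "objects": set(HISTORY[: HISTORY.index(tip) + 1])}


def fetch(advertiser, server):
    """Read the tip from one mirror, then send it as a want to another.

    Returns True if the serving mirror has the wanted object.
    """
    want = advertiser["tip"]
    return want in server["objects"]


A = make_mirror("c2")  # lagging mirror, missing c3
B = make_mirror("c3")  # up-to-date mirror

a_then_b = fetch(A, B)  # time moves forward: B has everything A advertised
b_then_a = fetch(B, A)  # time moves backward: A never received c3
print(a_then_b, b_then_a)
```

No server-side check can rescue the second case, because the object
simply is not there yet.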

So I think any such mirroring setup would want to try very hard to make
sure you hit the same machine both times anyway, regardless of this
patch.

I'm fine with leaving it. I was just questioning its utility since AFAICT,
it has never worked and nobody has cared. It's not too much code,
though, and it is only run when we hit the race, so I don't think it is
hurting anything.

-Peff