Re: [PATCH 0/2] Optionally support push options on up-to-date branches

Christopher Lindee <christopher.lindee@xxxxxxxxxxx> · Fri, 15 Mar 2024 08:58:35 +0000

brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> writes:

> On 2024-03-14 at 23:29:40, Junio C Hamano wrote:
> > "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes:
> > > As a server operator, I also expect that there
> > > are people doing lots of needless attempts at pushing in automated
> > > systems (because with enough users, there will be at least some who do
> > > bizarre or inefficient things), and I would prefer to avoid serving
> > > those requests if I don't need to.  (For example, for us, reference
> > > updates need to go through a distributed commit protocol to update
> > > multiple replicas of the repository, and if there's no ref updates, then
> > > we cut out multiple services which we don't need to contact.)
> >
> > Yes, but if you have an extra no-op ref update in the bunch, that
> > one is excluded from the set of changes to be synchronised across
> > replicas, no?
> 
> 
> As I said, we can't do that, because we have to verify that the old
> value is correct.  Even if we did perform that optimization, once we
> know we have a write, we have to contact every replica (in case the user
> does indeed send us a pack, which is allowed by the protocol, even if
> bizarre), whereas with a ref request or a read, we can satisfy that with
> a single read replica.  I'm sure I can satisfy at least one to two
> orders of magnitude more no-op push attempts with the current
> optimization than I could if we didn't have it.  It might even be to the
> point of three or more orders of magnitude.  (My guess is that reftable
> will improve this even further, but I haven't yet measured.)
> 
> 
> I'm sure this is also true for most other implementations; serving
> reads, like ls-refs, are almost always substantially cheaper than writes
> and most server implementations are highly optimized for serving large
> numbers of reads due to CI farms and such.
> 
> 
> > > Note also that no-op ref updates cannot be simply omitted on the server
> > > side because we need to verify that the old value for the ref is
> > > correct, or we need to reject the operation as out of date.
> >
> > Yes, but isn't that something the user would rather want to find out
> > earlier rather than later?  Your push without no-op update may say
> > "ah, we are up to date" when we are behind or worse diverged.  If we
> > do the no-op update more often, we'd have more chance to catch such
> > a situation where the worldview the end-user's repository has is out
> > of sync with reality.
> 
> 
> Well, we were up to date at the time the ls-refs operation completed, or
> the push wouldn't have the same old and new OIDs.  The client has to
> declare the old ref value that it got from the ls-refs output in order
> for the operation to succeed, since the server-side Git won't update the
> references if the old value doesn't match.  (This is how the client
> knows whether a push is a fast-forward or not: it computes it, and if
> the remote side has an unknown value, then it's not.)  Other than
> updating the push options, there's no functional benefit to making a
> second request, since the ls-refs output already told us where the
> branch is.  That's why the current optimization is functionally correct.
> 
> 
> In any event, there are alternate implementations of the Git protocol
> which did not implement our current optimization and did, for purposes
> of mirroring, make excessive numbers of no-op requests, and we did have
> to ask for that to be fixed.  I assure you, the no-op behaviour is
> better for users because it's faster and substantially more likely to
> succeed (because it is effectively a read-only ls-refs request), whereas
> a new ref update is less efficient, slower, and substantially more
> likely to fail (due to functional limitations on write throughput).
> 
> 
> I also expect there will be poorly written `pre-push` and `pre-receive`
> hooks which will not like getting a no-op update and not handle it
> gracefully.  I'm pretty sure Git LFS will handle this just fine, but
> lots of people have various shell scripts that probably will not.

Is this a potential avenue for DoS?

Clearly, I'm not the first to send no-op ref updates, so this change
doesn't create the problem, but it could allow non-coders to exploit
existing problems much more easily.

If this could be a problem, we can take the time now to solve it.  A
simple limit on the number of no-op refs received before the server
terminates with REF_STATUS_REMOTE_REJECT should be sufficient, right?
If we bake that expectation into the official Git implementation, then
other implementations will be empowered to take similar steps.

For at least one service[1], GitLab only supports Push Options on, at
most, 10 refs.  If other server operators have similar limitations,
then this is an easy bud to nip.

Regards,
Chris.

[1] https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/merge_requests/push_options_handler_service.rb?ref_type=heads#L5