Re: Huge push upload despite only having a tiny change

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



(cc: Jonathan Tan for "git push" discussion; Minh Thai for negotiate
 hook discussion)
Hi,

Elijah Newren wrote:

> I had a user report that two nearly identical pushes (the second being
> an amended commit of the first) took dramatically differing amounts of
> time and amount of data uploaded (from 4.5 seconds and about 21k
> uploaded, to 223 seconds and over 100 MB uploaded).

Yes, this is why I want push negotiation.  (It's been something we've
been discussing for protocol v2 for push.)

If they fetch before they push, does that help?

[...]
> * The server was running Gerrit 3.1.4 (i.e. jgit).

Gerrit servers have the interesting property that many people are
pushing to the same Git repo.  (This is common in some other hosting
scenarios such as Gitlab, but the most common case among Git users
still seems to be pushing to a repo you own.)

When you push, because there's no negotiation phase, the only
information we have about what is present on the server is what is in
the ref advertisement.  (We have remote-tracking branches which seem
potentially useful, but we don't have a way to ask the server "are
these objects you still have?")  The ref advertisement describes the
*current* state of all refs.  If I am pushing a new topic branch (in
Gerrit jargon, a new change for review) based on the *old* state of a
branch that has moved on, then we can only hope that some other ref
(for example a tag) points to a recent enough state to give us a base
for what to upload.

There is one trick a server can use to mitigate this: advertise some
refs that don't exist!  If you advertise a ref ".have", then Git
will understand that the server has that object but it is not an
actual ref.  Gerrit uses this trick in its HackPushNegotiateHook[1]
to advertise a few recent commits.

At $DAYJOB we ran into some clients where "a few recent commits" was
not sufficient to get to history that the client is aware of.  We
tried changing it to do some exponential deepening, and that helped.
We should probably upstream that change for other Gerrit users.

Gerrit also advertises some other ".have"s, for example for recent
changes by the same author in case you're uploading an amended
version.  That's less relevant here.

But fundamentally, this is something that cannot be addressed properly
without improving the "git push" protocol (adding a negotiation
phase).

Summary: (1) try fetching first (2) let's improve
HackPushNegotiateHook#advertiseRefs (3) let's improve "git push"
protocol to make this a problem of the past.

Thanks and hope that helps,
Jonathan

[1] https://gerrit.googlesource.com/gerrit/+/e1f4fee1f3ce674f44cb9788e6798ff8522bb876/java/com/google/gerrit/server/git/receive/HackPushNegotiateHook.java#111



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux