Hi, On Tue, Jun 2, 2020 at 6:53 PM Jonathan Nieder <jrnieder@xxxxxxxxx> wrote: > > (cc: Jonathan Tan for "git push" discussion; Minh Thai for negotiate > hook discussion) > Hi, > > Elijah Newren wrote: > > > I had a user report that two nearly identical pushes (the second being > > an amended commit of the first) took dramatically differing amounts of > > time and amount of data uploaded (from 4.5 seconds and about 21k > > uploaded, to 223 seconds and over 100 MB uploaded). > > Yes, this is why I want push negotiation. (It's been something we've > been discussing for protocol v2 for push.) > > If they fetch before they push, does that help? In my attempts to reproduce which I assume are similar, yes it helps immensely. > [...] > > * The server was running Gerrit 3.1.4 (i.e. jgit). > > Gerrit servers have the interesting property that many people are > pushing to the same Git repo. (This is common in some other hosting > scenarios such as Gitlab, but the most common case among Git users > still seems to be pushing to a repo you own.) > > When you push, because there's no negotiation phase, the only > information we have about what is present on the server is what is in > the ref advertisement. (We have remote-tracking branches which seem > potentially useful, but we don't have a way to ask the server "are > these objects you still have?") The ref advertisement describes the > *current* state of all refs. If I am pushing a new topic branch (in > Gerrit jargon, a new change for review) based on the *old* state of a > branch that has moved on, then we can only hope that some other ref > (for example a tag) points to a recent enough state to give us a base > for what to upload. > > There is one trick a server can use to mitigate this: advertise some > refs that don't exist! If you advertise a ref ".have", then Git > will understand that the server has that object but it is not an > actual ref. Gerrit uses this trick in its HackPushNegotiateHook[1] > to advertise a few recent commits. > > At $DAYJOB we ran into some clients where "a few recent commits" was > not sufficient to get to history that the client is aware of. We > tried changing it to do some exponential deepening, and that helped. > We should probably upstream that change for other Gerrit users. > > Gerrit also advertises some other ".have"s, for example for recent > changes by the same author in case you're uploading an amended > version. That's less relevant here. > > But fundamentally, this is something that cannot be addressed properly > without improving the "git push" protocol (adding a negotiation > phase). > > Summary: (1) try fetching first (2) let's improve > HackPushNegotiateHook#advertiseRefs (3) let's improve "git push" > protocol to make this a problem of the past. > > Thanks and hope that helps, > Jonathan > > [1] https://gerrit.googlesource.com/gerrit/+/e1f4fee1f3ce674f44cb9788e6798ff8522bb876/java/com/google/gerrit/server/git/receive/HackPushNegotiateHook.java#111 Yes, this helps a lot. Thanks for the pointers and explanations!