Re: Huge push upload despite only having a tiny change

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Tue, Jun 2, 2020 at 6:53 PM Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
>
> (cc: Jonathan Tan for "git push" discussion; Minh Thai for negotiate
>  hook discussion)
> Hi,
>
> Elijah Newren wrote:
>
> > I had a user report that two nearly identical pushes (the second being
> > an amended commit of the first) took dramatically differing amounts of
> > time and amount of data uploaded (from 4.5 seconds and about 21k
> > uploaded, to 223 seconds and over 100 MB uploaded).
>
> Yes, this is why I want push negotiation.  (It's been something we've
> been discussing for protocol v2 for push.)
>
> If they fetch before they push, does that help?

In my attempts to reproduce which I assume are similar, yes it helps immensely.

> [...]
> > * The server was running Gerrit 3.1.4 (i.e. jgit).
>
> Gerrit servers have the interesting property that many people are
> pushing to the same Git repo.  (This is common in some other hosting
> scenarios such as Gitlab, but the most common case among Git users
> still seems to be pushing to a repo you own.)
>
> When you push, because there's no negotiation phase, the only
> information we have about what is present on the server is what is in
> the ref advertisement.  (We have remote-tracking branches which seem
> potentially useful, but we don't have a way to ask the server "are
> these objects you still have?")  The ref advertisement describes the
> *current* state of all refs.  If I am pushing a new topic branch (in
> Gerrit jargon, a new change for review) based on the *old* state of a
> branch that has moved on, then we can only hope that some other ref
> (for example a tag) points to a recent enough state to give us a base
> for what to upload.
>
> There is one trick a server can use to mitigate this: advertise some
> refs that don't exist!  If you advertise a ref ".have", then Git
> will understand that the server has that object but it is not an
> actual ref.  Gerrit uses this trick in its HackPushNegotiateHook[1]
> to advertise a few recent commits.
>
> At $DAYJOB we ran into some clients where "a few recent commits" was
> not sufficient to get to history that the client is aware of.  We
> tried changing it to do some exponential deepening, and that helped.
> We should probably upstream that change for other Gerrit users.
>
> Gerrit also advertises some other ".have"s, for example for recent
> changes by the same author in case you're uploading an amended
> version.  That's less relevant here.
>
> But fundamentally, this is something that cannot be addressed properly
> without improving the "git push" protocol (adding a negotiation
> phase).
>
> Summary: (1) try fetching first (2) let's improve
> HackPushNegotiateHook#advertiseRefs (3) let's improve "git push"
> protocol to make this a problem of the past.
>
> Thanks and hope that helps,
> Jonathan
>
> [1] https://gerrit.googlesource.com/gerrit/+/e1f4fee1f3ce674f44cb9788e6798ff8522bb876/java/com/google/gerrit/server/git/receive/HackPushNegotiateHook.java#111

Yes, this helps a lot.  Thanks for the pointers and explanations!



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux