Re: [BUGREPORT] Why is git-push fetching content?

On 2023-02-21 at 22:01:04, Sean Allred wrote:
> What did you do before the bug happened? (Steps to reproduce your issue)
> 
>     # in a new directory,
>     cd $(mktemp -d)
> 
>     # initialize a new repository
>     git init
> 
>     # fetch a single commit from a remote
>     git fetch --filter=tree:0 --depth=1 $REMOTE $COMMIT_OID
> 
>     # create a ref on that remote
>     git push --no-verify $REMOTE $COMMIT_OID:$REFNAME
> 
> What did you expect to happen? (Expected behavior)
> 
>     I expected this process to complete very, very quickly. We believe
>     the version where it had been doing so was ~2.37.
> 
> What happened instead? (Actual behavior)
> 
>     The fetch completes nearly instantly as expected. We receive ~200B
>     from the remote for the commit object itself. What's truly bizarre
>     is what happens during the push. It starts receiving objects from
>     the remote! By the end of this process, the local repository is a
>     whopping ~700MB -- though interestingly only about a tenth of the
>     full repository size.
> 
>     This result in particular is strange in context. I would expect to
>     either see 'almost all' the repository content, 'about half' (we
>     have two trunks and fetching a single commit would at most fetch one
>     of them), or 'virtually none at all'. There isn't a straightforward
>     explanation for why 'one tenth' would make sense.

It's hard to know for certain what's going on here, because it depends
on your history.  You did a shallow, partial fetch with `--filter=tree:0`,
so you've likely received a single commit object and no trees or blobs.
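
To check what the fetch actually left in the repository, a sketch along
these lines may help.  `inspect_commit` is a hypothetical helper name,
not a Git command; it assumes you run it inside the repository and pass
the commit you fetched.  `git count-objects -v` reports how many objects
exist locally, and `git rev-list --objects --missing=print` lists the
reachable objects, marking absent ones with a leading `?` rather than
fetching them.

```shell
# Sketch only: inspect_commit is a hypothetical helper, not part of Git.
# Run it from inside the repository in question.
inspect_commit() {
    commit=$1

    # Loose and packed object counts for the whole repository.
    git count-objects -v

    # Walk the objects reachable from the commit.  --missing=print lists
    # objects that are absent locally with a leading "?" instead of
    # lazily fetching them from the promisor remote.
    git rev-list --objects --missing=print "$commit"
}

# Usage, with the placeholder from the report:
#   inspect_commit "$COMMIT_OID"
```

In a `--filter=tree:0 --depth=1` repository I would expect the commit
itself plus a `?` line for its root tree, but I have not verified that
against your setup.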

However, pushing a commit requires pushing its trees and blobs as well,
and you don't have those.  If the remote says that it already has the
commit, then Git might push no objects at all (which I've seen before)
and just update the reference.  However, if Git has to push even one
commit, it may need to walk the history to find common commits, which
requires fetching objects, and it has to send any trees and blobs as
well, which requires fetching those too.

My guess is that this is made worse by the fact that the repository is
shallow, which requires some additional computation and so means more
objects are fetched.  However, I'm not sure how that code works, so it
would be helpful for someone more familiar with it to chime in.

If you want to see what's going on, you can run with
`GIT_TRACE=1 GIT_TRACE_PACKET=1`, which may show interesting information
about the negotiation.
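
As a rough sketch of how to capture that output: both variables write to
standard error when set to 1, so redirecting stderr to a file keeps the
trace for later reading.  `trace_push` and the log file name are
hypothetical names used for illustration only.

```shell
# Sketch only: trace_push is a hypothetical wrapper, not a Git command.
# First argument is the log file; the rest is passed to "git push".
trace_push() {
    log=$1
    shift

    # With both variables set to 1, Git writes trace output to stderr,
    # so send stderr (trace plus normal push progress) to the log file.
    GIT_TRACE=1 GIT_TRACE_PACKET=1 git push "$@" 2>"$log"
}

# Usage, with the placeholders from the report:
#   trace_push push-trace.log --no-verify "$REMOTE" "$COMMIT_OID:$REFNAME"
```

Afterwards, `grep fetch push-trace.log` is a crude way to look for any
fetches that the push triggered.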
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
