On Tue, Jun 2, 2020 at 12:40 PM Derrick Stolee <stolee@xxxxxxxxx> wrote: > > On 6/2/2020 3:21 PM, Elijah Newren wrote: > > * The user was two commits behind the closely-related branch at the > > time of the first push, and 10 commits behind at the time of the > > second push. Running format-patch on these 10 commits that were on > > the server at the time shows their size is at most about ~55 k. > > This is most-likely the difference, since the pack-objects algorithm > only looks at the _boundary_ between the server's commits and the > commits-to-push. This also could have dramatically changed the delta-base > matches. > > Do you have exact object counts? It would help to know if somehow the > object discovery algorithm is at fault or the delta-base algorithm > is to blame. Output from the users' terminal for the two different runs, just redacting URLs. The first push was: Enumerating objects: 23, done. Counting objects: 100% (23/23), done. Delta compression using up to 16 threads Compressing objects: 100% (7/7), done. Writing objects: 100% (12/12), 952 bytes | 952.00 KiB/s, done. Total 12 (delta 5), reused 0 (delta 0), pack-reused 0 remote: Resolving deltas: 100% (5/5) remote: Processing changes: refs: 1, new: 1, done remote: commit 5efcf04: warning: subject >50 characters; use shorter first paragraph remote: remote: SUCCESS remote: remote: https://gerrit.internal.site/c/path/to/repo/+/341489 Remove ambiguous method from ArtifactProducerPeeringService [NEW] remote: To gerrit.internal.site:path/to/repo.git * [new branch] HEAD -> refs/for/develop The second was: Enumerating objects: 325816, done. Counting objects: 100% (266559/266559), done. Delta compression using up to 16 threads Compressing objects: 100% (102457/102457), done. Writing objects: 100% (257630/257630), 102.88 MiB | 619.00 KiB/s, done. Total 257630 (delta 122169), reused 218776 (delta 87259), pack-reused 0 remote: Resolving deltas: 100% (122169/122169) remote: Processing changes: refs: 1, updated: 1, done remote: commit 360a266: warning: subject >50 characters; use shorter first paragraph remote: remote: SUCCESS remote: remote: https://gerrit.internal.site/c/path/to/repo/+/341489 Remove ambiguous method from ArtifactProducerPeeringService remote: To gerrit.internal.site:path/to/repo.git * [new branch] HEAD -> refs/for/develop > For instance, `pack.useSparse` was enabled by default this release, > and has some opportunity to push extra objects. See [1] for more > details on both the "boundary" description (the "commit frontier") > but also how that option changes the algorithm. > > The only case I know of that could lead to sending extra objects > (that was not the case before) is described in t5322-pack-objects-sparse.sh > and 4f6d26b1 (list-objects: consume sparse tree walk, 2019-01-16). > It involves doing a full _copy_ of a tree from one position to > another, without "disturbing" the parent of the original tree. > > (I mean: copy directory A/B to C/D and make sure nothing is > different in directory A.) > > However, if these two pushes were with the same config setting, > I'm not sure what could have changed between the pushes to hit > this very narrow case. So, after a day of attempting to figure out how to debug, I found out we had a backup of the server between the first and second pushes and fairly close to the second push. I was able to launch a separate VM of that backup, and then attempt to make a local repo that I hoped mimic what the user had. I can't duplicate the 100MB push (which is about 25000x bigger than expected) as the user did, but I can with some tweaking state that the push size should be closer to ~2.5 KB and I can readly duplicate pushes in the 4-6 MB range -- i.e. about 1000x bigger than expected. pack.useSparse affects things, but not much. So, some output of my own duplication attempts: First, if I do a git fetch of the 'develop' branch (I can even nuke .git/FETCH_HEAD afterwards; it only matters that I have the history locally) from the place I'm pushing to, then the push size is tiny as expected: <Go to server, use rsync to make a pristine copy of repo> <Go to server, rsync repo from pristine state, then on my laptop:> $ git fetch git_over_ssh_url develop $ rm .git/FETCH_HEAD $ time git push git_over_ssh_url mike-push:refs/for/develop Enumerating objects: 37, done. Counting objects: 100% (37/37), done. Delta compression using up to 8 threads Compressing objects: 100% (11/11), done. Writing objects: 100% (22/22), 2.54 KiB | 2.54 MiB/s, done. Total 22 (delta 7), reused 14 (delta 1), pack-reused 0 remote: Checking connectivity: 22, done. To git_over_ssh_url * [new branch] mike-push -> refs/for/develop real 0m13.044s user 0m0.202s sys 0m0.146s <Remove the pack downloaded from the earlier fetch of develop> <Without this step, I always get small pushes:> $ ls -rt .git/objects/pack/ | tail -n 2 | xargs -n 1 -IPATH rm .git/objects/pack/PATH <Go to server, rsync repo from pristine state, then on my laptop:> $ time git push git_over_ssh_url mike-push:refs/for/develop Enumerating objects: 40785, done. Counting objects: 100% (20543/20543), done. Delta compression using up to 8 threads Compressing objects: 100% (9032/9032), done. Writing objects: 100% (16685/16685), 6.27 MiB | 208.00 KiB/s, done. Total 16685 (delta 7103), reused 12412 (delta 3389), pack-reused 0 remote: Resolving deltas: 100% (7103/7103), completed with 1864 local objects. remote: Checking connectivity: 22, done. To git_over_ssh_url * [new branch] mike-push -> refs/for/develop real 0m59.703s user 0m3.139s sys 0m0.735s <Also, this is only slightly affected by pack.useSparse...> <Go to server, rsync repo from pristine state, then on my laptop:> $ time git -c pack.useSparse=false push git_over_ssh_url mike-push:refs/for/develop Enumerating objects: 39891, done. Counting objects: 100% (18953/18953), done. Delta compression using up to 8 threads Compressing objects: 100% (7687/7687), done. Writing objects: 100% (14991/14991), 4.85 MiB | 228.00 KiB/s, done. Total 14991 (delta 6665), reused 11301 (delta 3141), pack-reused 0 remote: Resolving deltas: 100% (6665/6665), completed with 1939 local objects. remote: Checking connectivity: 22, done. To git_over_ssh_url * [new branch] mike-push -> refs/for/develop real 0m49.295s user 0m2.362s sys 0m0.592s Finally, you'll note that when I was reproducing, things were a bit different than what Mike (the end user) was dealing with. I was using vanilla git-2.27.0. Also, on the server, I was using plain old git-over-ssh, with the server running git-2.19.0. However, I also tried it with gerrit (i.e. jgit) as the server and got identical numbers for enumerating objects, counting objects, compressing objects, and the size of the pushed data and the number of resolved deltas was within 1% of the git-over-ssh case. And that was also true both with and without pack.useSparse. Any ideas? Anything else I should try or provide data on?