On Tue, Jun 14 2022, kylezhao(赵柯宇) wrote:

> Hi All,
>
> thank you for reading my report.
>
>
> How did we find out?
>
> The problem described in the title occurs on our git server.
> Each git repositories have multiple replicas on our servers to increase git read performance, and the data synchronization method between these replicas is git push.
> One day we found that the git push of a repository was significantly slow, and it took more than ten seconds to just create a new branch from an existing commit.
>
> How to reproduce the problem ?
>
> git version: 2.36.1
>
> # /data/test/repo is a bare git repository which can reproduce the problem
> $ cd /data/test/repo
>
> # number of refs
> $ git show-ref | wc -l
> 21134
> # pack information
> $ ls objects/pack/ -hl
> total 14G
> -r--r--r-- 1 root root 43M Jun 14 04:16 pack-9a7fc024652645a632fb82a4ff26c3ddf4883eed.bitmap
> -r--r--r-- 1 root root 169M Jun 14 04:15 pack-9a7fc024652645a632fb82a4ff26c3ddf4883eed.idx
> -r--r--r-- 1 root root 14G Jun 14 04:14 pack-9a7fc024652645a632fb82a4ff26c3ddf4883eed.pack
>
> # objects information
> $ git count-objects -v
> count: 0
> size: 0
> in-pack: 5185141
> packs: 1
> size-pack: 13938704
> prune-packable: 0
> garbage: 0
> size-garbage: 0
>
> # number of commits
> $ git rev-list --all | wc -l
> 955262
>
> $ cp -r /data/test/repo /data/test/replica-1
> $ cp -r /data/test/repo /data/test/replica-2
> $ cd /data/test/replica-1
>
> # create a branch from an existing commit
> $ git update-ref refs/heads/b_1 43fa4721c61106583cd552da85da3bd84f0f9929
> $ git show-ref | grep 43fa4721c61106583cd552da85da3bd84f0f9929
> 43fa4721c61106583cd552da85da3bd84f0f9929 refs/heads/b_1
>
> # number of commits of the ref
> $ git rev-list refs/heads/b_1 | wc -l
> 117836
>
> # git push with bitmap
> $ GIT_TRACE=1 git push file:///data/test/replica-2 refs/heads/b_1
> 04:19:07.654103 git.c:459 trace: built-in: git push file:///data/test/replica-2 refs/heads/b_1
> 04:19:07.690006 run-command.c:654 trace: run_command: unset GIT_DIR GIT_IMPLICIT_WORK_TREE GIT_PREFIX; 'git-receive-pack '\''/data/test/replica-2'\'''
> 04:19:07.694339 git.c:459 trace: built-in: git receive-pack /data/test/replica-2
> 04:19:07.751814 run-command.c:654 trace: run_command: git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress
> 04:19:07.754011 git.c:459 trace: built-in: git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress
> Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
> 04:19:20.304868 run-command.c:654 trace: run_command: GIT_ALTERNATE_OBJECT_DIRECTORIES=/data/test/replica-2/./objects GIT_OBJECT_DIRECTORY=/data/test/replica-2/./objects/tmp_objdir-incoming-CaCTHm GIT_QUARANTINE_PATH=/data/test/replica-2/./objects/tmp_objdir-incoming-CaCTHm git unpack-objects --pack_header=2,0
> remote: 04:19:20.306550 git.c:459 trace: built-in: git unpack-objects --pack_header=2,0
> 04:19:20.306903 run-command.c:654 trace: run_command: GIT_ALTERNATE_OBJECT_DIRECTORIES=/data/test/replica-2/./objects GIT_OBJECT_DIRECTORY=/data/test/replica-2/./objects/tmp_objdir-incoming-CaCTHm GIT_QUARANTINE_PATH=/data/test/replica-2/./objects/tmp_objdir-incoming-CaCTHm git rev-list --objects --stdin --not --all --quiet --alternate-refs '--progress=Checking connectivity'
> remote: 04:19:20.308332 git.c:459 trace: built-in: git rev-list --objects --stdin --not --all --quiet --alternate-refs '--progress=Checking connectivity'
> remote: 04:19:20.344031 run-command.c:654 trace: run_command: unset GIT_ALTERNATE_OBJECT_DIRECTORIES GIT_DIR GIT_OBJECT_DIRECTORY GIT_PREFIX; git --git-dir=/data/test/replica-2 for-each-ref '--format=%(objectname)'
> remote: 04:19:20.346359 git.c:459 trace: built-in: git for-each-ref '--format=%(objectname)'
> 04:19:20.395511 run-command.c:654 trace: run_command: git gc --auto --quiet
> remote: 04:19:20.397949 git.c:459 trace: built-in: git gc --auto --quiet
> To file:///data/test/replica-2
> * [new branch] b_1 -> b_1
>
> # reset replica-2 and remove bitmap
> $ rm -rf /data/test/replica-2
> $ cp -r /data/test/repo /data/test/replica-2
> $ rm objects/pack/pack-9a7fc024652645a632fb82a4ff26c3ddf4883eed.bitmap
>
>
> # git push without bitmap
> $ GIT_TRACE=1 git push file:///data/test/replica-2 refs/heads/b_1
> 04:20:44.633590 git.c:459 trace: built-in: git push file:///data/test/replica-2 refs/heads/b_1
> 04:20:44.668908 run-command.c:654 trace: run_command: unset GIT_DIR GIT_IMPLICIT_WORK_TREE GIT_PREFIX; 'git-receive-pack '\''/data/test/replica-2'\'''
> 04:20:44.673234 git.c:459 trace: built-in: git receive-pack /data/test/replica-2
> 04:20:44.720852 run-command.c:654 trace: run_command: git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress
> 04:20:44.723100 git.c:459 trace: built-in: git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress
> Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
> 04:20:44.800298 run-command.c:654 trace: run_command: GIT_ALTERNATE_OBJECT_DIRECTORIES=/data/test/replica-2/./objects GIT_OBJECT_DIRECTORY=/data/test/replica-2/./objects/tmp_objdir-incoming-UOWY1E GIT_QUARANTINE_PATH=/data/test/replica-2/./objects/tmp_objdir-incoming-UOWY1E git unpack-objects --pack_header=2,0
> remote: 04:20:44.802056 git.c:459 trace: built-in: git unpack-objects --pack_header=2,0
> 04:20:44.802474 run-command.c:654 trace: run_command: GIT_ALTERNATE_OBJECT_DIRECTORIES=/data/test/replica-2/./objects GIT_OBJECT_DIRECTORY=/data/test/replica-2/./objects/tmp_objdir-incoming-UOWY1E GIT_QUARANTINE_PATH=/data/test/replica-2/./objects/tmp_objdir-incoming-UOWY1E git rev-list --objects --stdin --not --all --quiet --alternate-refs '--progress=Checking connectivity'
> remote: 04:20:44.803930 git.c:459 trace: built-in: git rev-list --objects --stdin --not --all --quiet --alternate-refs '--progress=Checking connectivity'
> remote: 04:20:44.834388 run-command.c:654 trace: run_command: unset GIT_ALTERNATE_OBJECT_DIRECTORIES GIT_DIR GIT_OBJECT_DIRECTORY GIT_PREFIX; git --git-dir=/data/test/replica-2 for-each-ref '--format=%(objectname)'
> remote: 04:20:44.836220 git.c:459 trace: built-in: git for-each-ref '--format=%(objectname)'
> 04:20:44.884165 run-command.c:654 trace: run_command: git gc --auto --quiet
> remote: 04:20:44.886108 git.c:459 trace: built-in: git gc --auto --quiet
> To file:///data/test/replica-2
> * [new branch] b_1 -> b_1
>
>
> It can be seen from the above operations that git push is stuck in the git pack-objects process for about 13s for a long time.
> After I deleted the bitmap, the whole git push completed in less than 1s.
>
> During testing, we found that not every git repository was significantly affected by bitmap.
> This may be related to the number of objects in the git repository itself, the number of refs, and the sha1 pointed to by the pushed branch.
>
> We benefit from bitmap performance optimizations for git fetch and clone, but it seems that it affects the performance of git push.
>
> Maybe we can disable bitmap under the process of git push?
> As far as I know, the number of "counting objects" represented during a git push is usually small relative to the entire repository.
> Counting objects by building bitmaps in memory may take more time than before.
>
> Of course, it would be better if anyone has a better solution.

This is a known issue, I think you've found the same problem discussed in these past threads:

https://lore.kernel.org/git/38b99459158a45b1bea09037f3dd092d@xxxxxxxxxxxxxxxxxxxxxxxx/
https://lore.kernel.org/git/87zhoz8b9o.fsf@xxxxxxxxxxxxxxxxxxx/

The latter one in particular has a lot of extra details. The former also has the suggestion of a per-push bitmap configuration as a workaround.

As your numbers show it's still an issue today, but those threads should help you if you're looking to dig further into the root cause.

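For reference, the per-push workaround amounts to turning bitmaps off for a single invocation rather than deleting the .bitmap file. A minimal sketch, assuming the "pack.useBitmaps" setting is what controls bitmap use in the pack-objects process that push spawns (the wrapper name push_without_bitmaps is made up for illustration):

```shell
# Hypothetical wrapper: push with bitmap use disabled for this one command.
# "git -c" passes the setting on to the child "git pack-objects" process,
# so nothing is written to the repository's config.
push_without_bitmaps() {
    git -c pack.useBitmaps=false push "$@"
}

# Example with the paths from the report:
#   push_without_bitmaps file:///data/test/replica-2 refs/heads/b_1
```

The bitmap stays on disk this way, so fetches and clones served from the same repository can keep using it.
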
Aside from the underlying root causes, it would be very nice to fix the progress code in that area. Right now we appear to "stall" on "Enumerating objects", which is just a matter of us not having a separate progress bar for the very expensive bitmap work we're doing.
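
Until there is a dedicated progress line, one rough way to size that stall is to subtract the GIT_TRACE timestamps on either side of pack-objects. A small sketch, where "elapsed" is a hypothetical helper name and the timestamps are copied from the two traces quoted above (the "built-in: git pack-objects" line and the next trace line after it); it assumes both timestamps fall on the same day:

```shell
# Hypothetical helper: seconds between two GIT_TRACE timestamps (HH:MM:SS.micros).
elapsed() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, s, ":"); split(b, e, ":")
        # Convert each timestamp to seconds since midnight, then subtract.
        printf "%.6f\n", (e[1] * 3600 + e[2] * 60 + e[3]) - (s[1] * 3600 + s[2] * 60 + s[3])
    }'
}

elapsed 04:19:07.754011 04:19:20.304868   # with bitmap: prints 12.550857
elapsed 04:20:44.723100 04:20:44.800298   # without bitmap: prints 0.077198
```

So with the bitmap present, roughly 12.5s of the push is spent inside pack-objects, against under 0.1s without it.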