Re: rather slow 'git repack' in 'blob:none' partial clones

On Mon, Apr 12, 2021 at 2:37 PM SZEDER Gábor <szeder.dev@xxxxxxxxx> wrote:
>
> And a somewhat related issue: when the server doesn't support filters,
> then 'git clone --filter=...' prints a warning and proceeds to clone
> the full repo.  Reading ba95710a3b ({fetch,upload}-pack: support
> filter in protocol v2, 2018-05-03) this seems to be intentional and I
> tend to think that it makes sense (though I managed to overlook that
> warning twice today...  I surely wouldn't have overlooked a hard
> error, but that would perhaps be too harsh in this case, dunno).
> However, the resulting full clone is still marked as partial:
>
>   $ git clone --bare --filter=blob:none https://git.kernel.org/pub/scm/git/git.git git-not-really-partial.git
>   Cloning into bare repository 'git-not-really-partial.git'...
>   warning: filtering not recognized by server, ignoring
>   remote: Enumerating objects: 591, done.
>   remote: Counting objects: 100% (591/591), done.
>   remote: Compressing objects: 100% (293/293), done.
>   remote: Total 305662 (delta 372), reused 393 (delta 298), pack-reused 305071
>   Receiving objects: 100% (305662/305662), 96.83 MiB | 2.10 MiB/s, done.
>   Resolving deltas: 100% (228123/228123), done.
>   $ ls -l git-not-really-partial.git/objects/pack/
>   total 107568
>   -r--r--r-- 1 szeder szeder   8559608 Apr 12 21:13 pack-53f3ee0dfeaa8cea65c78473cd5904bf5ddfaa20.idx
>   -r--r--r-- 1 szeder szeder 101535430 Apr 12 21:13 pack-53f3ee0dfeaa8cea65c78473cd5904bf5ddfaa20.pack
>   -rw------- 1 szeder szeder     49012 Apr 12 21:13 pack-53f3ee0dfeaa8cea65c78473cd5904bf5ddfaa20.promisor
>   $ cat git-not-really-partial.git/config
>   [core]
>         repositoryformatversion = 1
>         filemode = true
>         bare = true
>   [remote "origin"]
>         url = https://git.kernel.org/pub/scm/git/git.git
>         promisor = true
>         partialclonefilter = blob:none

I ran into this same surprising behavior recently, too. I was adding
some automated testing to Bitbucket for partial clones and initially
tried to check whether the repository was configured with a partial
clone filter, only to find that the filter was still set even when the
server didn't support filters. The only way I could find to detect that
a requested partial clone didn't actually happen was to parse the
'git clone' output and look for the warning.
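
For what it's worth, here is a rough sketch of that kind of check in
shell. It is not the actual Bitbucket test code. The function names
(filter_was_ignored, clone_and_check) are made up for illustration, the
warning text is copied from the clone output quoted above, and forcing
LC_ALL=C is only a precaution on the assumption that the message might
otherwise be localized.

```shell
#!/bin/sh
# Warning text copied from the quoted 'git clone' output.
FILTER_WARNING='warning: filtering not recognized by server, ignoring'

# filter_was_ignored LOGFILE  (hypothetical helper, not part of git)
# Succeeds (exit 0) if the captured clone output contains the warning.
filter_was_ignored() {
    grep -qF -- "$FILTER_WARNING" "$1"
}

# clone_and_check URL DEST  (hypothetical helper, not part of git)
# Clones with --filter=blob:none while capturing stderr.
# Returns 0 if no warning was seen, 1 if the server ignored the filter,
# and 2 if the clone itself failed.
clone_and_check() {
    log=$(mktemp) || return 2
    # LC_ALL=C is an assumption: it guards against a translated message.
    if ! LC_ALL=C git clone --bare --filter=blob:none "$1" "$2" 2>"$log"; then
        cat "$log" >&2
        rm -f "$log"
        return 2
    fi
    if filter_was_ignored "$log"; then
        rm -f "$log"
        return 1
    fi
    rm -f "$log"
    return 0
}
```

Calling clone_and_check with the URL and destination from the quoted
example would return 1 against a server that ignores filters. Matching
on the message text is fragile, of course: it breaks if the wording
ever changes, which is part of why this feels like a workaround rather
than a proper check.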

>
> I wonder whether this is intentional, or that it is really the desired
> behavior, considering that 'gc/repack/fsck' still treat it as a
> partial clone, and, consequently, are affected by this slowness and
> much higher memory usage, and since the repo now contains a lot more
> objects than expected (all the blobs as well), they are much slower:
>
>   $ /usr/bin/time --format='elapsed: %E  max RSS: %Mk' git -C git-not-really-partial.git/ gc
>   Enumerating objects: 305662, done.
>   Counting objects: 100% (305662/305662), done.
>   Delta compression using up to 4 threads
>   Compressing objects: 100% (75200/75200), done.
>   Writing objects: 100% (305662/305662), done.
>   Total 305662 (delta 228123), reused 305662 (delta 228123), pack-reused 0
>   Removing duplicate objects: 100% (256/256), done.
>   elapsed: 4:28.96  max RSS: 1985100k
>   # with Peff's patch above:
>   $ /usr/bin/time --format='elapsed: %E  max RSS: %Mk' /home/szeder/src/git/bin-wrappers/git -C git-not-really-partial.git/ gc
>   Enumerating objects: 305662, done.
>   Counting objects: 100% (305662/305662), done.
>   Delta compression using up to 4 threads
>   Compressing objects: 100% (75200/75200), done.
>   Writing objects: 100% (305662/305662), done.
>   Total 305662 (delta 228123), reused 305662 (delta 228123), pack-reused 0
>   elapsed: 1:21.83  max RSS: 1959740k
>