Re: [BUG] `git push` sends unnecessary objects

On Wed, Sep 13, 2023 at 11:59:35PM +0100, Javier Mora wrote:
> I came across this issue accidentally when trying to move a directory
> containing a very large file, and deleting another file in that
> directory while I was at it.
> It seems to be caused by `pack.useSparse=true` being the default since
> v2.27 (which I found out after spending quite a while manually
> bisecting and compiling git since I noticed that this didn't happen in
> v2.25; commit de3a864 introduces this regression).
> 
> * Expected:
>     Pushing a commit that moves a file without modifying it shouldn't
> require sending a blob object for that file, since the remote server
> already has that blob object.
> * Observed:
>     Pushing a commit that moves a directory containing a file and also
> adds/deletes other files in that directory will for some reason also
> send blobs for all the files in that directory, even the ones that
> were already in the remote.
> * Consequences:
>     This has a very big impact on push times for very small commits
> that just move around files, if those files are very big (I had this
> happen with a >100MB file over a problematic connection... yikes!)
> * Note:
>     The commit introducing the regression does warn about possible
> scenarios involving a special arrangement of exact copies across
> directories, but these are not "copies", I just moved a file, which
> seems like a rather common operation.
> 
> Code snippet for reproduction:
> ```
> mkdir TEST_git
> cd TEST_git
> 
> mkdir -p local remote/origin.git
> cd remote/origin.git
> git init --bare
> cd ../../local
> git init
> git remote add origin file://"${PWD%/*}"/remote/origin.git
> 
> mkdir zig
> for i in a b c d e; do
>     dd if=/dev/urandom of=zig/"$i" bs=1M count=1
> done
> git add .
> git commit -m 'Add big files'
> git push -u origin master
> #>> Writing objects: 100% (8/8), 5.00 MiB | 13.27 MiB/s, done.
> #^ makes sense: 1 commit + 2 trees (/ and /zig) + 5 files = 8;
> #  5 MiB in total for the 5x 1 MiB binary files
> 
> git mv zig zag
> git commit -m 'Move zig'
> git push
> #>> Writing objects: 100% (2/2), 233 bytes | 233.00 KiB/s, done.
> #^ makes sense: 1 commit + 1 tree (/ renames /zig to /zag) = 2;
> #  a,b,c,d,e objects already in remote
> 
> git mv zag zog
> touch zog/f
> git add zog/f
> git commit -m 'For great justice'
> git push
> #>> Writing objects: 100% (9/9), 5.00 MiB | 24.63 MiB/s, done.
> #^ It re-uploaded the 5x 1 MiB blobs
> #  even though remote already had them.
> ```
> 
> Note that the latter doesn't happen if I use `git -c pack.useSparse=false push`.

I can reproduce this regression on v2.42.0 (self-compiled) on my Debian
testing system.
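
Until this is sorted out, the workaround from the report can be made
persistent for a single repository instead of passing `-c` on every push.
A minimal sketch, assuming the only goal is to switch the sparse walk off
for one clone; the helper name `disable_sparse_push` is made up for
illustration and is not part of git:

```shell
# Hypothetical helper: store pack.useSparse=false in the local config of
# the repository at path $1, so plain `git push` there behaves like
# `git -c pack.useSparse=false push`.
disable_sparse_push() {
    git -C "$1" config --local pack.useSparse false
}
```

For the reproduction above that would be `disable_sparse_push TEST_git/local`
(run from the directory containing TEST_git). Running
`git config --local --unset pack.useSparse` inside the clone reverts it.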

Cc'ing Derrick and Junio.

Thanks for the report!

-- 
An old man doll... just what I always wanted! - Clara
