Re: Delta compression not so effective

On 3/1/2017 14:19, Martin Langhoff wrote:
> On Wed, Mar 1, 2017 at 8:51 AM, Marius Storm-Olsen <mstormo@xxxxxxxxx> wrote:
>> BUT, even still, I would expect Git's delta compression to be quite effective, compared to the compression present in SVN.
>
> jar files are zipfiles. They don't delta in any useful form, and in
> fact they differ even if they contain identical binary files inside.
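To illustrate the point about zip archives differing despite identical contents, here is a minimal Python sketch. It assumes that per-entry timestamps are one cause of the difference (there may be others, such as entry order or compressor settings). The member name `Example.class`, the payload, and the two timestamps are made up for the example.

```python
import io
import zipfile


def build_archive(payload: bytes, date_time: tuple) -> bytes:
    """Build an in-memory zip holding one member with the given timestamp."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Hypothetical member name; only the timestamp varies between builds.
        info = zipfile.ZipInfo("Example.class", date_time=date_time)
        info.compress_type = zipfile.ZIP_DEFLATED
        zf.writestr(info, payload)
    return buf.getvalue()


def read_member(archive: bytes) -> bytes:
    """Extract the single member back out of an in-memory zip."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read("Example.class")


payload = b"identical contents" * 1000
first = build_archive(payload, (2017, 3, 1, 8, 51, 0))
second = build_archive(payload, (2017, 3, 1, 14, 19, 0))

# The file stored inside is byte-for-byte the same in both archives...
same_payload = read_member(first) == read_member(second)
# ...but the archives themselves are not, because the headers differ.
same_bytes = first == second
print(same_payload, same_bytes)
```

Because the payload is also deflate-compressed inside the archive, a small change to the input can change most of the compressed stream, which is why such files tend to delta poorly.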

If you look through the initial post, you'll see that the jar in question is in fact a tool (BFG) by Roberto Tyley, which is basically git filter-branch on steroids. I used it to quickly filter out the extern/ folder, just to prove most of the original size stems from that particular folder. That's all.

The repo does not contain zip or jar files. It has a few images and other compressed formats, plus a few hundred MB of proprietary files that never change, but nothing unusual.


>>     Commits: 32988
>>     DB (server) size: 139GB
>
> Are you certain of the on-disk storage at the SVN server? Ideally,
> you've taken the size with a low-level tool like `du -sh
> /path/to/SVNRoot`.

139GB is from 'du -sh' on the SVN server. I imported (via SubGit) directly from the (hotcopied) SVN folder on the server, so that is the true SVN size.


> Even with no delta compression (as per Junio and Linus' discussion),
> based on past experience importing jar/wars/binaries from SVN into
> git... I'd expect git's worst case to be on par with SVN, perhaps ~5%
> larger due to compression headers on incompressible data.
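The direction of that effect (compressing incompressible data makes it slightly larger) can be checked with a small Python sketch. It assumes random bytes as a stand-in for already-compressed files and measures plain zlib only, so the number it prints says nothing about Git's or SVN's per-object and repository overhead and is not a check of the ~5% estimate.

```python
import os
import zlib

# Random bytes stand in for data that is already compressed (an assumption
# made for this illustration; real repository contents will differ).
raw = os.urandom(1_000_000)

# zlib cannot find redundancy here, so it falls back to storing the data
# and adds its own framing on top.
compressed = zlib.compress(raw, 9)

overhead = len(compressed) - len(raw)
print(f"{len(raw)} -> {len(compressed)} bytes "
      f"({overhead / len(raw):.3%} overhead)")
```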

Yes, I was expecting a Git repo <139GB, but like Linus mentioned, something must be knocking the delta search off its feet, so it bails out. Going from loose objects to a 'hard' repack didn't make much difference.


Thanks!

--
.marius


