On 3/1/2017 14:19, Martin Langhoff wrote:
> On Wed, Mar 1, 2017 at 8:51 AM, Marius Storm-Olsen <mstormo@xxxxxxxxx> wrote:
>> BUT, even still, I would expect Git's delta compression to be quite effective, compared to the compression present in SVN.
>
> jar files are zipfiles. They don't delta in any useful form, and in
> fact they differ even if they contain identical binary files inside.

If you look through the initial post, you'll see that the jar in
question is in fact a tool (BFG) by Roberto Tyley, which is basically
git filter-branch on steroids. I used it to quickly filter out the
extern/ folder, just to prove most of the original size stems from that
particular folder. That's all.
The repo does not contain zip or jar files. It has a few images and
other compressed formats (plus a few hundred MB of proprietary files,
which never change), but nothing unusual.
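
For anyone who wants to check the point quoted above, that zip archives differ even when the files inside are identical, here is a minimal Python sketch. It only demonstrates one possible cause (the per-member timestamp stored in the archive); the member name `lib/Example.class` and the payload are made up for illustration, and real jar tooling may differ in other ways too.

```python
import io
import zipfile

MEMBER = "lib/Example.class"  # hypothetical member name, for illustration only


def make_archive(payload: bytes, date_time: tuple) -> bytes:
    """Build an in-memory zip holding one member stamped with date_time."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The modification time in ZipInfo is written into the archive itself.
        info = zipfile.ZipInfo(MEMBER, date_time=date_time)
        info.compress_type = zipfile.ZIP_DEFLATED
        archive.writestr(info, payload)
    return buffer.getvalue()


def member_bytes(archive_bytes: bytes) -> bytes:
    """Extract the single member's uncompressed content."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(MEMBER)


payload = b"identical binary content " * 1000
first = make_archive(payload, (2017, 3, 1, 8, 51, 0))
second = make_archive(payload, (2017, 3, 1, 14, 19, 0))

same_contents = member_bytes(first) == member_bytes(second)
same_archive = first == second
print(same_contents, same_archive)  # prints: True False
```

The extracted content matches, but the archive bytes do not, so two builds of the same jar look like different blobs to any byte-level comparison.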

>> Commits: 32988
>> DB (server) size: 139GB
>
> Are you certain of the on-disk storage at the SVN server? Ideally,
> you've taken the size with a low-level tool like `du -sh
> /path/to/SVNRoot`.

139GB is from 'du -sh' on the SVN server. I imported (via SubGit)
directly from the (hotcopied) SVN folder on the server. So true SVN size.

> Even with no delta compression (as per Junio and Linus' discussion),
> based on past experience importing jar/wars/binaries from SVN into
> git... I'd expect git's worst case to be on par with SVN, perhaps ~5%
> larger due to compression headers on incompressible data.

Yes, I was expecting a Git repo <139GB, but like Linus mentioned,
something must be knocking the delta search off its feet, so it bails
out. Loose object -> 'hard' repack didn't show that much difference.
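
As a rough way to gauge the "compression headers on incompressible data" overhead mentioned in the quoted estimate, here is a small Python sketch. Git deflates objects with zlib, but this is not Git's packing code: the 1 MiB of random bytes is only an assumed stand-in for already-compressed content, and the result says nothing about delta search.

```python
import os
import zlib

# Random bytes stand in for data that is already compressed (an assumption).
raw = os.urandom(1 << 20)  # 1 MiB

# Deflate at zlib's default level, then compare sizes.
packed = zlib.compress(raw)
overhead = len(packed) - len(raw)
ratio = len(packed) / len(raw)

print(f"raw={len(raw)} packed={len(packed)} overhead={overhead} ratio={ratio:.4f}")
```

Running the same measurement on a sample of the real blobs from the repo would give a more relevant number than random data does.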
Thanks!
--
.marius