Re: [PATCH] Add --no-reuse-delta option to git-gc

Junio C Hamano wrote:
> I think that sounds saner and more user friendly than specific
> knob to tune "window", "depth" and friends which are too
> technical.  It has an added attraction that we can redefine what
> exactly "hard" means later.

On that note, has any thought been given to looking at other compression algorithms? Gzip is a great high-speed compressor, but there are others out there (some a bit slower, some much slower at both compression and decompression) that produce substantially smaller output.

One could even, if one were in a particularly twisted state of mind, envision using CPU-intensive compression for less frequently-accessed objects and using gzip for active ones, on the theory that the best time/space tradeoff is not uniform across all the objects in a git repository. Presumably most of us never actually unpack the vast majority of objects in a git repository of reasonable age, so the fact that it'd take a little longer if we *did* want to unpack them isn't much of a downside compared to the upside of reclaiming disk space. That would mitigate the impact of using an algorithm that's slow at decompression.
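To make the hot/cold idea concrete, here is a rough sketch, not a proposal for git's actual object format. It assumes Python's standard zlib module stands in for the fast compressor and lzma for the CPU-intensive one. The 90-day threshold, the one-byte codec tag, and both function names are made up for illustration.

```python
import lzma
import zlib

# Hypothetical threshold: objects untouched for this many days count as "cold".
COLD_AFTER_DAYS = 90


def compress_object(data: bytes, days_since_access: int) -> bytes:
    """Compress recently used objects with zlib, cold ones with LZMA.

    A one-byte tag (an assumption of this sketch) is prepended so that the
    reader knows which codec to apply when unpacking.
    """
    if days_since_access < COLD_AFTER_DAYS:
        return b"Z" + zlib.compress(data, 6)
    return b"X" + lzma.compress(data, preset=9)


def decompress_object(blob: bytes) -> bytes:
    """Undo compress_object by dispatching on the leading codec tag."""
    tag, payload = blob[:1], blob[1:]
    if tag == b"Z":
        return zlib.decompress(payload)
    if tag == b"X":
        return lzma.decompress(payload)
    raise ValueError(f"unknown codec tag: {tag!r}")
```

The tag byte is what keeps the slow codec from costing anything on the hot path: only objects that were deliberately demoted to cold storage ever go through the slower decompressor.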

I think it'd be kind of neat to have my .git directory shrink by another 20+%. That's conservative; on maximumcompression.com's test of a mix of different file types including images, gzip compresses 64% and the best-scoring one does 80%. On English text gzip does 71% and the top scorer does 89%. Most of the top-tier compressors are proprietary, but there are some open-source ones that do pretty well.
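For what it's worth, the arithmetic behind "conservative" can be worked out as below. This assumes the quoted percentages are size reductions relative to the original input, and that git objects would compress like those test files, which is far from certain given that packs already store deltas. The function name is made up for illustration.

```python
def relative_shrink(old_reduction: float, new_reduction: float) -> float:
    """Fraction by which output shrinks when switching compressors.

    Both arguments are size reductions relative to the uncompressed input,
    e.g. 0.64 means the compressed output is 36% of the original size.
    """
    old_size = 1.0 - old_reduction
    new_size = 1.0 - new_reduction
    return 1.0 - new_size / old_size


print(f"mixed files:  {relative_shrink(0.64, 0.80):.0%}")  # prints 44%
print(f"English text: {relative_shrink(0.71, 0.89):.0%}")  # prints 62%
```

In other words, going from 36% of the original size down to 20% makes the compressed output about 44% smaller, and going from 29% down to 11% makes it about 62% smaller, so 20% leaves a fair margin.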

Maybe not worth the added complexity, but I thought I'd toss it out there. It probably makes more sense (if it makes any at all) after Linus's suggestion to not unpack after cloning is in place. Once the upstream has gone to the trouble of CPU-intensive compressing, you certainly don't want to force clones to have to spend the time repeating the same work.

-Steve (who suspects this is a "yes, we talked this over early in git's history" question, but what the heck)