Re: How to prevent Git from compressing certain files?

On Wed, Jul 22, 2009 at 09:49:31PM +0200, Dirk Süsserott wrote:

> Somewhere I read that Git isn't supposed to efficiently handle binary
> files. Of course, I don't want to merge my files, just store them with
> their history and git-push them to some "safe place".

Git handles binary files better than many systems. The downsides are:

  - you can't do file-level diffing and merging very well, for obvious
    reasons (though actually, git is better than most; it makes it easy
    to look at both sides individually and pick the one you want).

  - really _big_ files can give lousy performance. Git assumes single
    files can fit into memory, which means files in the gigabyte range
    (or hundreds of megabytes if your machine is old :) ) can be awful.
    It also means that things like inexact rename detection and finding
    delta candidates can be slow.

> I figured that pushing and git gc'ing both try to compress those files
> (or differences) really hard. Works great for "regular" files, but is
> pointless with jpegs.
> 
> Question: Is there a way to prevent Git from trying to compress certain
> files based on their extension?

There are actually two types of compression that git uses: delta
compression between similar objects in packs, and zlib compression of
loose objects and objects within packs.

You almost certainly don't want zlib compression on your jpegs, as they
are already compressed. You can turn off zlib compression entirely by
setting core.compression to 0. Unfortunately, this turns off compression
for _all_ objects, which means in a mixed-use repo you won't be
compressing your text (and even in a photos-only repo, you are not
compressing your commit messages).
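
For illustration, here is a minimal sketch of that setting, tried out in a
throwaway repository. The "repo" scratch directory is just an assumption
for the example; in practice you would run the config command inside your
own repository.

```shell
# Sketch: disable zlib compression in a throwaway repository.
# "repo" is a hypothetical scratch directory, not something from this thread.
repo=$(mktemp -d)
git init -q "$repo"

# Level 0 turns zlib compression off, and it applies to every object type
# (blobs, trees, commits), not only the jpegs.
git -C "$repo" config core.compression 0

# Read the setting back to confirm it was stored.
git -C "$repo" config --get core.compression
```

The setting is stored in the repository's own config, so other
repositories on the same machine keep their default compression level.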

Delta compression between two jpegs, or between two versions of a jpeg
where the image data itself was modified, is unlikely to be useful.
However, if you use EXIF metadata in the file, then you will save a lot
of space between versions with the same image data, but different
metadata. So it's worth leaving delta compression on in that case, and
probably turning it off otherwise.

As Jakub mentioned, you can unset the delta gitattribute for just your
jpegs. You can also turn off deltas entirely by setting pack.window to
0, though you may be losing some benefit on your non-blob objects.
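
The two options above could be sketched roughly like this. The file
patterns and the scratch repository are assumptions for the example;
adjust the patterns to whatever extensions your photos actually use.

```shell
# Sketch: try both options in a throwaway repository.
# "repo" is a hypothetical scratch directory, not something from this thread.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Option 1: unset the delta attribute for jpegs only, so git does not
# attempt delta compression on blobs at matching paths.
printf '%s\n' '*.jpg -delta' '*.jpeg -delta' > .gitattributes

# Ask git which value of the delta attribute applies to some sample paths
# (the paths do not need to exist for this check).
git check-attr delta -- photo.jpg notes.txt

# Option 2: turn off the delta search for all objects in this repository.
git config pack.window 0
```

You would normally pick one of the two. The attribute route leaves deltas
enabled for everything that is not a jpeg, which is usually what you want
in a mixed-use repo.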

-Peff