Re: Fw: Curiosity

On 2021-12-15 at 18:07:20, Junio C Hamano wrote:
> João Victor Bonfim  <JoaoVictorBonfim@xxxxxxxxxxxxxx> writes:
> > Since Git is almost used for everything at this point, is there
> > any intent on providing better support for non textual file types?
> > Why do I say this? Take this game mod which I follow as example ->
> > https://github.com/SolariusScorch/XComFiles <- whenever I clone it
> > Git takes a very long time to download 452 MB
> > of files, some of which, from my perspective, aren't being delta
> > compressed like the text files are (since, whenever reading a log
> > of what changes were made, git creates and undoes modes for all
> > binary files, some of which only changed by a pixel from one
> > colour to another).
> 
> Our delta compression does not care whether the contents are text or
> binary, so if it is not compressed well, it can be a sign that
> the contents are not compressible to begin with, at least with the
> xdelta binary-diff-patch engine we use.  Improvement designs,
> algorithms and patches are always welcome ;-)

To expand on this, if what you're storing is already compressed, like
the Ogg Vorbis files and PNGs found in that repository, then it
generally will not delta well.  This is also true of things like
Microsoft Office or OpenOffice documents, because they're essentially
Zip files.

The delta algorithm compresses files by looking for byte sequences they
have in common.  If a file is already compressed using something like
Deflate, which PNGs and Zip files use, then even very similar inputs
will usually produce very different compressed output, so deltification
will be largely ineffective.
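
As a rough illustration of that effect, here's a small Python sketch.  It
is not Git's delta code; the sample data and the byte-position
comparison are my own assumptions, chosen only to show how a one-byte
change in the input spreads through a Deflate stream (via zlib).

```python
import zlib


def differing_bytes(x: bytes, y: bytes) -> int:
    """Count positions where x and y differ, plus any difference in length."""
    mismatches = sum(p != q for p, q in zip(x, y))
    return mismatches + abs(len(x) - len(y))


# Hypothetical stand-in for an asset: repetitive but non-trivial data.
original = b"".join(b"row %d: %d\n" % (i, i * i % 251) for i in range(1500))

# The same data with a single byte flipped near the start.
changed = bytearray(original)
changed[10] ^= 0xFF
modified = bytes(changed)

# Compare the two versions before and after Deflate compression.
raw_diff = differing_bytes(original, modified)
packed_original = zlib.compress(original)
packed_modified = zlib.compress(modified)
packed_diff = differing_bytes(packed_original, packed_modified)

print(f"raw:        {raw_diff} of {len(original)} bytes differ")
print(f"compressed: {packed_diff} of {len(packed_original)} bytes differ")
```

The raw versions differ in exactly one byte, which a delta can describe
very cheaply.  The compressed versions differ in more places than that,
and typically from the point of the change onward.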

There are two main solutions to this.  One is to store your data
uncompressed in the repository and compress it as part of a build step.
This makes your checkouts larger, but it makes your repository smaller.
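
A build step for the first approach could look something like the
following Python sketch.  The function name and the directory layout in
the usage note are hypothetical; the idea is only that the uncompressed
files live in the repository and the Deflate-compressed archive is a
build product that is never committed.

```python
import zipfile
from pathlib import Path


def pack_assets(source_dir: Path, archive_path: Path) -> int:
    """Deflate every file under source_dir into archive_path.

    Returns the number of files written.  archive_path should live
    outside source_dir so the archive does not end up packing itself.
    """
    archive_path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir, with forward slashes.
                archive.write(path, arcname=path.relative_to(source_dir).as_posix())
                count += 1
    return count
```

You'd call it as, say, pack_assets(Path("assets"), Path("build/assets.zip"))
and list the build directory in .gitignore.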

The other is to store the files outside of the repository proper.  Some
folks use Git LFS for this, but you could also just store a manifest
with file names and secure hashes, plus a download location on a public
server.
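
To make the manifest idea concrete, here's a minimal Python sketch.  The
manifest layout (a "base_url" plus a "files" mapping), the function
names, and the choice of SHA-256 are all assumptions for illustration;
this is not how Git LFS works internally.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large assets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(asset_dir: Path, base_url: str) -> dict:
    """Map each file's relative name to its SHA-256 hash."""
    files = {
        path.relative_to(asset_dir).as_posix(): sha256_of(path)
        for path in sorted(asset_dir.rglob("*"))
        if path.is_file()
    }
    return {"base_url": base_url, "files": files}


def find_stale(manifest: dict, asset_dir: Path) -> list:
    """Return names whose local copy is missing or fails its hash check."""
    stale = []
    for name, expected in manifest["files"].items():
        local = asset_dir / name
        if not local.is_file() or sha256_of(local) != expected:
            stale.append(name)
    return stale
```

Only the manifest (for example, serialized as JSON) gets committed.  A
checkout script would then download whatever find_stale() reports from
base_url and verify the hashes again afterwards.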
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
