Re: Git and OpenDocument (OpenOffice.org) files

Mike Hommey <mh <at> glandium.org> writes:


> 
> A zipped file will be 100% different at each revision.
> The unzipped counterpart may be similar for 90% or more between revisions.
> 
> Mike
> 

In my (modest) experience, not really.

In fact, ODF files are zip archives containing many individual files (for
instance, the archive of an Impress presentation contains all the images
that appear in the presentation).

Now, zip differs from .tar.gz in that tar.gz first concatenates the files
and then compresses the result as a single stream, while zip compresses or
stores each file individually and then concatenates and indexes the results.

The consequence is that in a tar.gz file, changing a single byte in one of
the internal files can change everything that follows in the compressed
stream, while in a zip file, changing an internal file only affects that
file's part of the archive (plus its entry in the index).
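
To illustrate the zip half of this claim, here is a minimal Python sketch
(not from the original thread; the helper names make_zip and raw_member, the
member names and the sample data are all made up for the example). It builds
two archives that differ only in content.xml and compares the compressed
bytes of each member exactly as they sit in the archive.

```python
import io
import struct
import zipfile


def make_zip(members):
    """Build an in-memory zip archive from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members.items():
            # Fixed timestamp, so that only the payloads differ between builds.
            info = zipfile.ZipInfo(name, date_time=(2000, 1, 1, 0, 0, 0))
            info.compress_type = zipfile.ZIP_DEFLATED
            zf.writestr(info, data)
    return buf.getvalue()


def raw_member(archive, name):
    """Return one member's compressed bytes as stored in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        info = zf.getinfo(name)
    # Local file header: 30 fixed bytes, then the file name and extra field.
    name_len, extra_len = struct.unpack_from(
        "<HH", archive, info.header_offset + 26
    )
    start = info.header_offset + 30 + name_len + extra_len
    return archive[start:start + info.compress_size]


picture = bytes(range(256)) * 64  # stand-in for an image that never changes
v1 = make_zip({"content.xml": b"<doc>hello world</doc>",
               "Pictures/a.png": picture})
v2 = make_zip({"content.xml": b"<doc>hello there</doc>",
               "Pictures/a.png": picture})

same_picture = raw_member(v1, "Pictures/a.png") == raw_member(v2, "Pictures/a.png")
same_content = raw_member(v1, "content.xml") == raw_member(v2, "content.xml")
print(same_picture, same_content)  # True False
```

The unchanged member is byte-for-byte identical in both archives, which is
the kind of repeated run a delta can reuse; only the edited member differs.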

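This means that:
- if you have an ODF document containing lots of internal objects (e.g.
images) that do not change much from version to version, git can make
very good deltas, because those parts of the archive stay byte-identical.
- conversely, if you have an ODF document whose size is dominated by the
document content itself (content.xml), git will not be able to make good
deltas, because that file is deflated and even a small edit changes much of
its compressed form.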

As an example, I am finding that Impress presentations (dominated by images)
delta very well, while Calc spreadsheets (dominated by content) do not.

It would probably be useful to write a filter that takes an ODF file and
re-zips it so that the inner content.xml file is only stored, rather
than deflated.  This could then be plugged into git's file filtering
machinery (the clean/smudge filters configured through gitattributes).
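
A rough sketch of such a filter in Python, to make the idea concrete. This
is only an illustration under assumptions, not a tested tool: the function
names store_content_xml and main are hypothetical, and I am assuming that
keeping every other member's order and compression method unchanged is
enough for the office suite to keep reading the file.

```python
import io
import sys
import zipfile


def store_content_xml(data, stored=("content.xml",)):
    """Re-zip an archive, writing the members named in `stored` uncompressed.

    All other members keep their original order and compression method.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            payload = src.read(info)  # decompressed bytes of this member
            new = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            new.external_attr = info.external_attr
            new.compress_type = (
                zipfile.ZIP_STORED
                if info.filename in stored
                else info.compress_type
            )
            dst.writestr(new, payload)
    return out.getvalue()


def main():
    """Filter entry point: archive on stdin, re-zipped archive on stdout."""
    sys.stdout.buffer.write(store_content_xml(sys.stdin.buffer.read()))
```

To try it, one would call main() from a small script and register that
script as the clean filter, i.e. set filter.odf.clean in the git config and
add a line such as "*.ods filter=odf" to .gitattributes (the filter name
"odf" and the script are placeholders). The output is still a zip archive,
so a smudge filter may not be needed, but I have not checked that.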

Sergio

