Mike Hommey <mh <at> glandium.org> writes:

> A zipped file will be 100% different at each revision.
> The unzipped counterpart may be similar for 90% or more between revisions.
>
> Mike

In my (modest) experience, not really: ODF files are in fact zip
collections of many individual files (for instance, for an Impress
presentation the zip collection contains all the images that appear in
the presentation).

Now, zip differs from .tar.gz: tar.gz first concatenates the files and
then compresses the whole thing, while zip compresses (or just stores)
each file individually and then concatenates and indexes the results.
The difference is that in a tar.gz file, changing a single byte in one
of the internal files can lead to a completely different compressed
stream, while in a zip file, changing an internal file only affects the
part of the archive that holds that file.

This means that:

- if you have an ODF document containing lots of internal objects (e.g.
  images) that change little from version to version, git can make very
  good deltas;
- conversely, if you have an ODF document whose size is dominated by the
  actual content, git will not be able to make good deltas.

As an example, I am finding that Impress presentations (dominated by
images) delta very well, while Calc spreadsheets (dominated by content)
do not.

It would probably be nice to have a filter that takes an ODF file and
re-zips it so that the inner content.xml file is only stored rather than
deflated. This could then be used with the git file filtering machinery.

Sergio
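For what it is worth, the re-zipping step suggested above could look roughly like the sketch below. It is untested against real ODF documents and only uses Python's standard zipfile module. The function name rezip_stored and its stored_names parameter are made up for illustration. It keeps the member order (ODF expects the mimetype entry first) and copies the original timestamps, so that the same input always gives the same output.

```python
import io
import zipfile


def rezip_stored(data: bytes, stored_names=("content.xml",)) -> bytes:
    """Rewrite a zip archive so that the named members are stored, not deflated.

    Hypothetical helper: all other members keep their original compression
    method, and the member order is preserved.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            payload = src.read(info)  # decompressed bytes of this member
            # Build a fresh entry, reusing the original name and timestamp
            # so that repeated runs produce byte-identical archives.
            new = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            new.external_attr = info.external_attr
            if info.filename in stored_names:
                new.compress_type = zipfile.ZIP_STORED
            else:
                new.compress_type = info.compress_type
            dst.writestr(new, payload)
    return out.getvalue()
```

To use it as a clean filter, a small wrapper would read the document from sys.stdin.buffer and write the result to sys.stdout.buffer, with the script configured as filter.<name>.clean and selected for the relevant extensions in .gitattributes. Whether the deltas actually improve for a given spreadsheet would have to be measured.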