Johannes Sixt <j.sixt <at> viscovery.net> writes:
>
> Peter Krefting wrote:
> > Since OpenOffice documents are just zipped xml files, I wondered how
> > difficult it would be to create some hooks/hack git to track the files
> > inside the archives instead?
>
> You could write a "clean" filter that "recompresses" the archive with
> level 0 upon git-add.
>

A couple of notes:

1) For OpenOffice documents whose size is dominated by embedded images
and other large objects, the git delta mechanism already performs
reasonably well, since OO files are Zip archives in which each file is
compressed separately. If you do not change an image, its compressed
bytes stay the same from one version to the next and the delta can be
found.

2) For OO documents whose size is dominated by plain content, the git
delta mechanism cannot work, since the zip compression introduces
"mixing": a small change in the document is converted into a very large
change in the zip file.

It could be possible to write a clean filter to uncompress before
commit. However, there is a catch with the complementary smudge filter
used at checkout. If you do not smudge properly, git always shows the
file as changed with respect to the index. Smudging correctly would mean
using the very same compression level and compression method that OO
uses, which can be a little tricky.

I have tried using the zip binary in both the clean and the smudge
phases and it does not work nicely: the smudged file is always different
from the original one. One should probably work at a lower level (libzip)
to have finer control over what is happening, and prepend to the
uncompressed file the compression parameters to be restored on smudging.

The bigger issue, however, is that the clean/smudge approach can be
really slow when dealing with large OO files.
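
To make the "clean" side discussed above concrete, here is a minimal sketch in Python of a filter that rewrites a zip archive with every member stored uncompressed (level 0). It is only an illustration under stated assumptions, not something tested against real OO files: the names store_uncompressed and main are made up for this example, it drops extra fields and archive comments, and it does nothing about the smudge problem described above.

```python
import io
import sys
import zipfile


def store_uncompressed(data: bytes) -> bytes:
    """Return a copy of the zip archive in `data` with all members stored (no compression)."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", compression=zipfile.ZIP_STORED) as dst:
        # Keep the original member order; ODF expects "mimetype" to come first.
        for info in src.infolist():
            # Reuse the original name and timestamp so identical input
            # always yields identical output bytes.
            member = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            member.compress_type = zipfile.ZIP_STORED
            member.external_attr = info.external_attr
            dst.writestr(member, src.read(info))
    return out.getvalue()


def main() -> None:
    """Filter entry point: git feeds the file on stdin and reads the result from stdout."""
    data = sys.stdin.buffer.read()
    if zipfile.is_zipfile(io.BytesIO(data)):
        data = store_uncompressed(data)
    # Anything that is not a zip archive is passed through unchanged.
    sys.stdout.buffer.write(data)
```

To try it, one would save this as a script that ends with a call to main() (say ooclean.py, a hypothetical name), register it with something like git config filter.oozip.clean "python3 /path/to/ooclean.py", and add a line such as "*.odt filter=oozip" to .gitattributes. The filter name oozip is likewise an assumption. With only a clean filter and no matching smudge, the checked-out file is the uncompressed archive, which differs byte for byte from what OO wrote.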