On Dec 14, 2006, at 14:46, Shawn Pearce wrote:
> And yet I get good delta compression on a number of ZIP formatted files which don't get good additional zlib compression (<3%). Doing the above would cause those packfiles to explode to about 10x their current size.
Yes, that's because in a zip archive each file is compressed independently. Similar things might happen when checking in uncompressed tar files containing JPEGs. The question is whether you prefer bad time usage or bad space usage when handling large binary blobs. Maybe we should use a faster, less precise algorithm instead of giving up entirely.

Still, I think doing anything based on filename is a mistake. If we want a heuristic to avoid spending too much time deltifying large compressed files, the heuristic should be based on content, not filename. Maybe we could use some "magic" like the file(1) command does, allowing git to say a bit more about the content of blobs. This could be used both for ordering files during deltification and for determining whether to try deltification at all.

-Geert
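[Editorial note: a minimal sketch of the kind of content-based check Geert describes, assuming we only sniff a few well-known magic numbers at the start of a blob. The function name, the set of signatures, and the idea of using the result to skip or deprioritize deltification are illustrative assumptions, not git's actual logic.]

#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: return 1 if the blob starts with a well-known
 * magic number for an already-compressed format, 0 otherwise.
 * A real heuristic would need to cover more formats and probably
 * sample the data's entropy as well, since many compressed payloads
 * carry no recognizable signature.
 */
static int looks_compressed(const unsigned char *buf, size_t len)
{
	static const struct {
		unsigned char magic[4];
		size_t magiclen;
	} sigs[] = {
		{ { 'P', 'K', 0x03, 0x04 }, 4 },	/* ZIP local file header */
		{ { 0x1f, 0x8b, 0x00, 0x00 }, 2 },	/* gzip */
		{ { 0xff, 0xd8, 0xff, 0x00 }, 3 },	/* JPEG */
		{ { 0x89, 'P', 'N', 'G' }, 4 },		/* PNG */
	};
	size_t i;

	for (i = 0; i < sizeof(sigs) / sizeof(sigs[0]); i++) {
		if (len >= sigs[i].magiclen &&
		    !memcmp(buf, sigs[i].magic, sigs[i].magiclen))
			return 1;
	}
	return 0;
}

Such a check only costs a few byte comparisons per blob, so it could be applied before the expensive delta search; blobs it flags could either be skipped outright or given a much smaller delta search window.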