On 8/17/06, Nicolas Pitre <nico@xxxxxxx> wrote:
> We're striving for optimal data storage here, so don't be afraid to use one of the available types for an "object stream" object. When you think of it, the deflating of multiple objects into a single zlib stream can be applied to all object types, not only deltas. If deflating many blobs into one zlib stream is ever deemed worth it, then the encoding will already be ready for it. Also, you can leverage existing code to write headers, etc.
Here are two more cases that need to be accounted for in the packs.

1) If you zip something and it gets bigger, we need an entry type that just stores the object without zipping it. Zipping jpegs or mpegs will likely make them significantly bigger. Or does zlib already detect this case and do a null compression?

2) If you delta something and the delta is bigger than the object being deltaed, the delta code should detect this and store the full object instead of the delta. Again, jpegs and mpegs will trigger this. You may even want to require that the delta be smaller than 80% of the full object.

Shawn is planning on looking into true dictionary-based compression. That will generate even more types inside of a pack. If dictionary compression works out, full-text search can be added with a little more overhead. True dictionary-based compression has the potential to go even smaller than the current 280MB. The optimal way to do this is for each pack to contain its own dictionary.

-- 
Jon Smirl
jonsmirl@xxxxxxxxx
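
The two fallback rules described in cases 1) and 2) can be sketched roughly as below. This is only an illustration in Python, not git's pack code: the function names, the "stored"/"deflated"/"delta"/"full" labels, and the constant name are all hypothetical, and the delta is assumed to have been computed elsewhere. On the zlib question: as far as I know, deflate falls back to "stored" blocks when the input will not compress, so the growth on already-compressed data should be a handful of header and checksum bytes rather than anything large. An explicit size check is still cheap.

```python
import zlib
from typing import Tuple

# Assumption: the 80% figure floated in the message, used as a cutoff.
DELTA_MAX_RATIO = 0.8


def pack_entry(data: bytes) -> Tuple[str, bytes]:
    """Deflate the object, but keep the raw bytes if deflating does not help.

    The returned tag is a hypothetical entry type, not a real pack type.
    """
    deflated = zlib.compress(data)
    if len(deflated) < len(data):
        return "deflated", deflated
    return "stored", data


def choose_representation(full: bytes, delta: bytes) -> Tuple[str, bytes]:
    """Keep the delta only if it is smaller than DELTA_MAX_RATIO of the full object."""
    if len(delta) < DELTA_MAX_RATIO * len(full):
        return "delta", delta
    return "full", full
```

Highly repetitive text should come back tagged "deflated", while random bytes (standing in for a jpeg or mpeg) should come back tagged "stored". Likewise, a delta that is 90% of the size of its base would be rejected in favour of the full object.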