On Thu, 17 Aug 2006, Shawn Pearce wrote:

> I'm going to try to integrate this into core GIT this weekend.
> My current idea is to make use of the OBJ_EXT type flag to add
> an extended header field behind the length which describes the
> "chunk" as being a delta chain compressed in one zlib stream.
> I'm not overly concerned about saving lots of space in the header
> here as it looks like we're winning a huge amount of pack space,
> so the extended header will probably itself be a couple of bytes.
> This keeps the shorter reserved types free for other great ideas. :)

We're striving for optimal data storage here, so don't be afraid to use
one of the available types for an "object stream" object. Because when
you think of it, the deflating of multiple objects into a single zlib
stream can be applied to all object types, not only deltas. If
deflating many blobs into one zlib stream is ever deemed worth it, then
the encoding will already be ready for it. Also you can leverage
existing code to write headers, etc.

I'd suggest you use OBJ_GROUP = 0 as a new primary object type. The
"size" field in the header could then become the number of objects that
are included in the group. Most of the time that will fit in the low
4 bits of the first header byte, but if there are more than 15 grouped
objects then more bits can be used on the following byte. Anyway, so
far all the code to generate and parse that is already there. If ever
there is a need for more extensions, they could be prefixed with a pure
zero byte (an object group with a zero object count, which is
distinguishable from a real group).

Then, having the number of grouped objects, you just have to list the
usual headers for those objects, which are their type and inflated size
just like regular object headers, including the base sha1 for deltas.
Again you already have code to produce and parse those.

And finally just append the objects' payload in a single deflated
stream.
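To make the layout above concrete, here is a rough sketch in Python (chosen for brevity, not git's C). The helper names encode_header, decode_header and write_group are made up for illustration and are not git functions. The header encoding is the usual pack one (3 type bits and the low 4 size bits in the first byte, then 7 size bits per continuation byte). The base sha1 that delta entries would carry is left out to keep it short.

```python
import zlib

# OBJ_GROUP = 0 is the value proposed in this mail; the other codes are the
# existing 3-bit pack object types.
OBJ_GROUP, OBJ_COMMIT, OBJ_TREE, OBJ_BLOB = 0, 1, 2, 3


def encode_header(obj_type, size):
    """Pack-style header: type and low 4 size bits in the first byte,
    then 7 size bits per following byte (high bit set = more follows)."""
    byte = (obj_type << 4) | (size & 0x0F)
    size >>= 4
    out = bytearray()
    while size:
        out.append(byte | 0x80)
        byte = size & 0x7F
        size >>= 7
    out.append(byte)
    return bytes(out)


def decode_header(buf, pos=0):
    """Inverse of encode_header; returns (type, size, next position)."""
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x07
    size = byte & 0x0F
    shift = 4
    while byte & 0x80:
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos


def write_group(objects):
    """objects is a list of (type, payload) pairs.

    Layout: group header whose "size" is the object count, one regular
    header per object, then all payloads in a single deflated stream.
    (Assumption: no delta entries, so no base sha1 is written.)
    """
    out = bytearray(encode_header(OBJ_GROUP, len(objects)))
    for obj_type, data in objects:
        out += encode_header(obj_type, len(data))
    out += zlib.compress(b"".join(data for _, data in objects))
    return bytes(out)
```

With this encoding a group of up to 15 objects costs a single byte of group header, and a count of zero comes out as the pure zero byte mentioned above.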

This way the reading of an object from a group can be optimized if the
object data is located at the beginning of the stream: you only need to
inflate the bytes leading up to the desired data (possibly caching
those for further delta replaying), inflate the data for the desired
object, and then ignore the rest of the stream.


Nicolas
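
The partial read described above could look roughly like this, again as a Python sketch rather than git code. read_from_group is a hypothetical helper; it assumes the inflated sizes of the grouped objects are already known from their headers, and it relies on zlib's decompressobj with a max_length limit to stop inflating once the wanted object is complete.

```python
import zlib


def read_from_group(deflated, sizes, index):
    """Return the payload of object `index` from a single deflated stream
    holding all payloads back to back. `sizes` are the inflated sizes
    taken from the per-object headers (hypothetical input format)."""
    offset = sum(sizes[:index])      # bytes that precede the wanted object
    need = offset + sizes[index]     # inflate this much and no more
    inflater = zlib.decompressobj()
    out = bytearray()
    data = deflated
    while len(out) < need and data:
        # max_length caps the output; leftover input lands in unconsumed_tail
        out += inflater.decompress(data, need - len(out))
        data = inflater.unconsumed_tail
    if len(out) < need:
        raise ValueError("stream ended before the requested object")
    # out[:offset] is the prefix one might cache for later delta replaying
    return bytes(out[offset:need])
```

For example, with sizes [10, 40, 5], asking for index 1 inflates 50 bytes and never touches the last object; asking for index 0 stops after 10.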