On Thu, 7 Sep 2006, linux@xxxxxxxxxxx wrote:

> A few notes:
>
> Re: base-128 encodings, it's a pet peeve of mine that many people, even
> while trying to save space, waste it by allowing redundant encodings.
> The optimal way, assuming msbit=1 means "more", is
>
> 0x00 -> 0                 0x01 -> 1
> 0x7f -> 127               0x80 0x00 -> 128
> 0x80 0x7f -> 255          0x81 0x00 -> 256
> 0xfe 0x7f -> 16383        0xff 0x00 -> 16384
> 0xff 0x7f -> 16511        0x80 0x00 0x00 -> 16512

Indeed.  But...

Since we already use 3 bits for the object type, and since most objects
are larger than 15 bytes, two bytes give us 4 + 7 = 11 bits of size, or
up to 2047.  With your encoding that would become 2175.  So a byte would
be saved only for objects whose size is between 2048 and 2175.  I don't
know what proportion of objects falls into that range in the average
pack, but even for those objects that means a reduction of less than
0.1%, assuming an average deflate rate of 50%.  And we can forget about
cases where the size would require a fourth byte or more, since saving a
byte in those cases is even less significant.

So I don't think we would gain much from that encoding unless/until the
pack format is made completely incompatible due to other changes, and
that's something we should try to avoid as much as possible anyway.

Nicolas
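For reference, the non-redundant base-128 scheme from the quoted message can be sketched in Python roughly as below. This is only an illustration of the idea, not code from git: the names `encode` and `decode` are made up here, and the sketch handles a bare integer without the 3 type bits that the pack object header carries. It assumes, as the quoted text does, that msbit=1 means "more", so every byte except the last has its top bit set (which makes 16512 come out as 0x80 0x80 0x00 in this sketch).

```python
def encode(n: int) -> bytes:
    """Encode a non-negative integer so that each value has exactly one form."""
    if n < 0:
        raise ValueError("n must be non-negative")
    out = [n & 0x7F]                   # last byte: msbit clear means "stop"
    n >>= 7
    while n:
        n -= 1                         # skip values that shorter forms already cover
        out.append(0x80 | (n & 0x7F))  # earlier bytes: msbit set means "more"
        n >>= 7
    return bytes(reversed(out))        # most significant group goes first


def decode(data: bytes) -> int:
    """Inverse of encode(); expects exactly one encoded value."""
    value = 0
    for i, byte in enumerate(data):
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            if i != len(data) - 1:
                raise ValueError("trailing bytes after encoded value")
            return value
        value += 1                     # undo the offset applied by encode()
    raise ValueError("truncated input")


if __name__ == "__main__":
    for n in (0, 1, 127, 128, 255, 256, 16383, 16384, 16511):
        print(n, encode(n).hex(" "))
```

A plain base-128 encoding without the offset tops out at 16383 in two bytes, while this one reaches 16511. That difference of 128 values is the same gain discussed above for the two-byte object header (2047 versus 2175).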