On Thu, Oct 27, 2011 at 23:04, Junio C Hamano <gitster@xxxxxxxxx> wrote: > In addition to four basic types (commit, tree, blob and tag), the pack > stream can encode a few other "representation" types, such as REF_DELTA > and OFS_DELTA. As we allocate 3 bits in the first byte for this purpose, > we do not have much room to add new representation types in place, but we > do have one value reserved for future expansion. We have 2 values reserved, 0 and 5. > When bit 4-6 encodes type 5, the first byte is used this way: > > - Bit 0-3 denotes the real "extended" representation type. Because types > 0-7 can already be encoded without using the extended format, we can > offset the type by 8 (i.e. if bit 0-3 says 3, it means representation > type 11 = 3 + 8); > > - Bit 4-6 has the value "5"; > > - Bit 7 is used to signal if the _third_ byte needs to be read for larger > size that cannot be represented with 8-bit. This is very complicated. We don't need more complex logic in the pack encoding. We _especially_ do not need yet another variant of how to store a variable length integer in the pack file. I'm sorry, but we already have two different variants and this just adds a third. It is beyond crazy. Last time (this is now years ago but whatever) Nico and I discussed adding a new type to packs it was for the alternate tree encoding in "pack v4". Trees happen so often that type code 5 is a good value to use for these. Later you talked about using the extended type to store a cattree blob thing, which would not appear nearly as often as a normal directory listing type tree that was encoded using the pack v4 style encoding... I think saving type 5 for a small frequently occurring type is a good thing. > As it is unlikely for us to pack things that do not need to record any > size, the second byte is always used in full to encode the low 8-bit of > the size. > > I haven't started using type=8 and upwards for anything yet, but because > we have only one "future expansion" value left, I want us to be extremely > careful in order to avoid painting us into a corner that we cannot get out > of, so I am sending this out early for a preliminary review. I would have said something more like: When bit 4-6 encodes "0", then: - Bit 0-3 and bit 7 are used normally to encode a variable length "size" integer. These may be 0 indicating no size information. - 2nd-nth bytes store remaining "size" information, if bit 7 was set. - The immediate next byte encodes the extended type. This type is stored using the OFS_DELTA offset varint encoding, and thus may be larger than 256 if we ever need it to be. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html