On Mon, Jan 16, 2012 at 3:46 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> When we can't handle a change with a compatible encoding change, we
> can introduce feature bits and conditionally encode old formats for
> old peers. This is just more work and eats into a more limited
> feature bit space.

I'd like to hear a lot more about this; it's the real problem. I think
we've already got one example in the codebase of encoding for
backwards compatibility, and it's a huge mess. Could we construct
reasonable macros/functions that take in the compat bits of the daemon
we're targeting and then select the appropriate encoder based on
those, for instance? It certainly shouldn't be hacked together on a
case-by-case basis, because we're going to run into a lot more of
these things as we start testing for forwards-and-backwards
compatibility.

> There are two things going on in this branch. The first part is a
> 'standard' way of constructing the encode/decode functions to
> facilitate backward and forward compatibility and incompatibility
> detection. The basic scheme is:
>
>  1 byte  - version of this encoding
>  1 byte  - incompat version: the oldest code version we expect to
>            be able to decode this
>  4 bytes - length of payload
>  ... data ...
>
> In general, when we decode, we verify that the incompat version is
> <= our (code) version. If not, we throw an exception. Then we decode
> the payload, using the version for any conditionals we need (e.g.,
> to skip newly added fields). We skip any data at the end.
>
> When we revise an encoding, we should add new fields at the end and,
> in general, leave old fields in place, ideally with values that
> won't confuse old code.

This sounds good to me!

> To make this painless, there are a few new macros to do the
> encode/decode boilerplate. If the encode/decode functions were
> originally
>
> void pool_snap_info_t::encode(bufferlist& bl) const
> {
>   __u8 struct_v = 1;
>   ::encode(struct_v, bl);
>   ::encode(snapid, bl);
>   ::encode(stamp, bl);
>   ::encode(name, bl);
> }
>
> void pool_snap_info_t::decode(bufferlist::iterator& bl)
> {
>   __u8 struct_v;
>   ::decode(struct_v, bl);
>   ::decode(snapid, bl);
>   ::decode(stamp, bl);
>   ::decode(name, bl);
> }
>
> then we would revise them to be
>
> void pool_snap_info_t::encode(bufferlist& bl) const
> {
>   ENCODE_START(2, 2, bl);
>
> The new version is 2. v1 code can't decode this, so the second
> argument (incompat) is also 2.
>
>   ::encode(snapid, bl);
>   ::encode(stamp, bl);
>   ::encode(name, bl);
>   ::encode(new_thing, bl);
>   ENCODE_FINISH();
> }
>
> void pool_snap_info_t::decode(bufferlist::iterator& bl)
> {
>   DECODE_START_LEGACY(2, bl, 2);
>
> We can still decode v1 encodings, but they don't have the (new)
> length and incompat version fields, so use the _LEGACY macro. The
> second 2 means we started using the new approach with v2.
>
>   ::decode(snapid, bl);
>   ::decode(stamp, bl);
>   ::decode(name, bl);
>   if (struct_v >= 2)
>     ::decode(new_thing, bl);
>   DECODE_FINISH();
> }

Is there a DECODE_START (non-legacy) for new structs?

> This requires an initial incompat change to add the length and
> incompat fields, but then we can generally add things without
> breakage.

Hmm... if we come up with something that talks feature bits we could
do this nicely with a compat bit. :)
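To check my own understanding of the framing before I dive into the
branch, here's a standalone sketch of the scheme described above. It
uses std::string in place of bufferlist, and all of the names
(start_envelope and friends) are mine, not the branch's:

// Standalone sketch of the version/compat/length framing described
// above.  Hypothetical names; not the actual ENCODE_START macros.
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Write the envelope header: 1-byte version, 1-byte incompat version,
// and a 4-byte length placeholder.  Returns the offset of the length
// field so finish_envelope() can patch it once the payload is known.
size_t start_envelope(std::string& bl, uint8_t version, uint8_t incompat)
{
  bl.push_back(static_cast<char>(version));
  bl.push_back(static_cast<char>(incompat));
  size_t len_off = bl.size();
  bl.append(4, '\0');
  return len_off;
}

void finish_envelope(std::string& bl, size_t len_off)
{
  uint32_t len = static_cast<uint32_t>(bl.size() - (len_off + 4));
  std::memcpy(&bl[len_off], &len, sizeof(len));  // assumes little-endian
}

// Read the header, refuse encodings too new for us, and report where
// the payload ends so the caller can skip trailing fields it doesn't
// understand.
struct envelope_t {
  uint8_t version;
  size_t payload_off;
  size_t payload_end;
};

envelope_t open_envelope(const std::string& bl, size_t off,
                         uint8_t code_version)
{
  uint8_t version = static_cast<uint8_t>(bl.at(off));
  uint8_t incompat = static_cast<uint8_t>(bl.at(off + 1));
  if (incompat > code_version)
    throw std::runtime_error("encoding incompat version exceeds ours");
  uint32_t len;
  std::memcpy(&len, &bl[off + 2], sizeof(len));
  return envelope_t{version, off + 6, off + 6 + len};
}

If that's roughly what the macros expand to, then the skip-trailing-
data behavior falls out of payload_end for free, which is nice.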
> The second question is how to test compatibility between different
> versions of code. There are a few parts to this.
>
> First, a ceph-dencoder tool is compiled for each version of the code
> that is able to encode, decode, and dump (in json) whatever
> structures we support. It works something like this:
>
>  ceph-dencoder object_info_t -i inputfile decode dump_json
>
> to read in encoded data, decode it, and dump it as json. We can do a
> trivial identity check (that decode of encode matches) with
>
>  ceph-dencoder object_info_t -i inputfile decode dump_json > /tmp/a
>  ceph-dencoder object_info_t -i inputfile decode encode decode dump_json > /tmp/b
>  cmp /tmp/a /tmp/b
>
> Obviously that should always pass. For testing cross-version
> encoding, we need a ceph-dencoder and a corpus of encoded objects
> for each version. Assuming you have that, you can (a) make sure we
> can decode anything from other versions without crashing, and (b)
> compare the dumps between versions and whitelist changes (e.g., when
> fields are added or removed). You can also specify feature bits to
> test encoding for older versions: take, say, everything from the
> v0.42 corpus, encode it with the v0.40 feature bits, and verify that
> the v0.40 version of ceph-dencoder can handle it. And
> verify+whitelist diffs.

Awesome!

> How to build the per-version corpus? We can write unit tests that
> explicitly generate interesting object instances. That's tedious and
> time consuming, but probably best, since the developer knows what
> corner cases are interesting.

If we've got a single tool that can handle all the structs, we can
presumably encode each struct from a given version? Obviously
constructing actual state would require more work, but just checking
the default-constructor version of each struct would go a long,
loooong way toward automating encode/decode checks. (A sketch of what
I mean is below my signature.)

-off to review the actual code,
Greg
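P.S. To make the default-constructor idea concrete: something like the
following inside ceph-dencoder would let one command walk every
registered type and emit a default-constructed encoding per type for
the corpus. Everything here (register_type, example_t, the registry
itself) is hypothetical, just to illustrate the shape:

// Hypothetical sketch only -- not code from ceph-dencoder.  Register
// each struct once; then enumerate the registry to generate a
// default-constructed corpus entry per type.
#include <cstdint>
#include <cstring>
#include <functional>
#include <iostream>
#include <map>
#include <string>

using bufferlist = std::string;  // stand-in for ceph::bufferlist

std::map<std::string, std::function<void(bufferlist&)>>& type_registry()
{
  static std::map<std::string, std::function<void(bufferlist&)>> r;
  return r;
}

template <typename T>
void register_type(const std::string& name)
{
  type_registry()[name] = [](bufferlist& bl) {
    T t;           // default-constructed instance
    t.encode(bl);  // assumes the usual encode(bufferlist&) convention
  };
}

// Toy struct standing in for pool_snap_info_t and friends.
struct example_t {
  uint32_t snapid = 0;
  void encode(bufferlist& bl) const {
    bl.append(reinterpret_cast<const char*>(&snapid), sizeof(snapid));
  }
};

int main()
{
  register_type<example_t>("example_t");
  // A 'generate defaults' verb could walk the registry and write one
  // encoded blob per type into the per-version corpus directory.
  for (auto& p : type_registry()) {
    bufferlist bl;
    p.second(bl);
    std::cout << p.first << ": " << bl.size() << " bytes\n";
  }
  return 0;
}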