There are two things going on in this branch.

The first part is a 'standard' way of constructing the encode/decode functions to facilitate backward and forward compatibility and incompatibility detection. The basic scheme is:

  1 byte  - version of this encoding
  1 byte  - incompat version.. the oldest code version we expect to be able to decode this
  4 bytes - length of payload
  ... data ...

In general, when we decode, we verify that the incompat version is <= our (code) version. If not, we throw an exception. Then we decode the payload, using the version for any conditionals we need (e.g., to skip newly added fields). We skip any data at the end.

When we revise an encoding, we should add new fields at the end and, in general, leave old fields in place, ideally with values that won't confuse old code. When that doesn't work, we'll eventually need to bump the incompat version and write off old code. This generally isn't a problem if people are rolling forward frequently. Only users who make big jumps will have trouble having daemons with different versions interact (at least when it comes to encoding; if protocols change, that's another matter).

When we can't handle a change with a compatible encoding change, we can introduce a feature bit and conditionally encode old formats for old peers. This is just more work and eats into a more limited feature bit space.

To make this painless, there are a few new macros to do the encode/decode boilerplate. If the encode/decode functions were originally

void pool_snap_info_t::encode(bufferlist& bl) const {
  __u8 struct_v = 1;
  ::encode(struct_v, bl);
  ::encode(snapid, bl);
  ::encode(stamp, bl);
  ::encode(name, bl);
}

void pool_snap_info_t::decode(bufferlist::iterator& bl) {
  __u8 struct_v;
  ::decode(struct_v, bl);
  ::decode(snapid, bl);
  ::decode(stamp, bl);
  ::decode(name, bl);
}

then we would revise them to be

void pool_snap_info_t::encode(bufferlist& bl) const {
  ENCODE_START(2, 2, bl);
  // The new version is 2.  v1 code can't decode this, so the second
  // argument (incompat) is also 2.
  ::encode(snapid, bl);
  ::encode(stamp, bl);
  ::encode(name, bl);
  ::encode(new_thing, bl);
  ENCODE_FINISH();
}

void pool_snap_info_t::decode(bufferlist::iterator& bl) {
  DECODE_START_LEGACY(2, bl, 2);
  // We can still decode v1, but it doesn't have the (new) length and
  // incompat version fields, so use the _LEGACY macro.  The second 2
  // means we started using the new approach with v2.
  ::decode(snapid, bl);
  ::decode(stamp, bl);
  ::decode(name, bl);
  if (struct_v >= 2)
    ::decode(new_thing, bl);
  DECODE_FINISH();
}

This requires an initial incompat change to add the length and incompat fields, but then we can generally add things without breakage.
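For what it's worth, here is a minimal, self-contained sketch of the header handling those macros boil down to. It uses a plain std::string instead of bufferlist, and the helper names (encode_start() and friends) are invented for illustration; this is just the scheme described above, not the actual macro implementation.

#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Append the 6-byte header: version, incompat version, and a length
// placeholder that encode_finish() fills in once the payload is written.
size_t encode_start(std::string& buf, uint8_t version, uint8_t incompat) {
  buf.push_back(version);
  buf.push_back(incompat);
  buf.append(4, '\0');                // length placeholder
  return buf.size();                  // payload starts here
}

void encode_finish(std::string& buf, size_t payload_start) {
  uint32_t len = buf.size() - payload_start;
  std::memcpy(&buf[payload_start - 4], &len, sizeof(len));  // assumes little-endian host
}

// Read the header, refuse anything this code version can't understand,
// and note where the payload ends so newer trailing fields can be skipped.
uint8_t decode_start(const std::string& buf, size_t& pos, size_t& payload_end,
                     uint8_t code_version) {
  uint8_t version = buf[pos++];
  uint8_t incompat = buf[pos++];
  uint32_t len;
  std::memcpy(&len, &buf[pos], sizeof(len));
  pos += sizeof(len);
  if (incompat > code_version)
    throw std::runtime_error("encoding is too new for this code");
  payload_end = pos + len;
  return version;                     // use for per-field conditionals
}

void decode_finish(size_t& pos, size_t payload_end) {
  pos = payload_end;                  // skip any fields appended by newer code
}

A caller would bracket its payload with encode_start()/encode_finish(), and on the read side use the version returned by decode_start() for field conditionals and decode_finish() to skip anything a newer encoding appended.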
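Similarly, here is a rough sketch of the feature-bit fallback mentioned above, i.e. conditionally emitting the old format for old peers. The feature-aware encode() signature and the CEPH_FEATURE_NEW_THING name are assumptions for illustration, not something in this branch.

void pool_snap_info_t::encode(bufferlist& bl, uint64_t features) const {
  if ((features & CEPH_FEATURE_NEW_THING) == 0) {
    // Old peer: emit the legacy v1 layout it still understands.
    __u8 struct_v = 1;
    ::encode(struct_v, bl);
    ::encode(snapid, bl);
    ::encode(stamp, bl);
    ::encode(name, bl);
    return;
  }
  // New peer: use the new scheme from the example above.
  ENCODE_START(2, 2, bl);
  ::encode(snapid, bl);
  ::encode(stamp, bl);
  ::encode(name, bl);
  ::encode(new_thing, bl);
  ENCODE_FINISH();
}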
----

The second question is how to test compatibility between different versions of the code. There are a few parts to this.

First, a ceph-dencoder tool is compiled for each version of the code that is able to encode, decode, and dump (in json) whatever structures we support. It works something like this:

  ceph-dencoder object_info_t -i inputfile decode dump_json

to read in encoded data, decode it, and dump it as json. We can do a trivial identity check (that decode of encode matches) with

  ceph-dencoder object_info_t -i inputfile decode dump_json > /tmp/a
  ceph-dencoder object_info_t -i inputfile decode encode decode dump_json > /tmp/b
  cmp /tmp/a /tmp/b

Obviously that should always pass.

For testing cross-version encoding, we need a ceph-dencoder and a corpus of objects encoded for each version. Assuming you have that, you can (a) make sure we can decode anything from other versions without crashing, and (b) compare the dumps between versions and whitelist the changes (e.g., when fields are added or removed).

You can also specify feature bits to test encoding for older versions: take, say, everything in the v0.42 corpus, encode it with the v0.40 feature bits, and verify that the v0.40 ceph-dencoder can handle it. And again, verify and whitelist the diffs.

How do we build the per-version corpus? We can write unit tests that explicitly generate interesting object instances. That's tedious and time consuming, but probably best, since the developer knows which corner cases are interesting. Alternatively (or additionally), a patch in wip-encoding instruments the encode() wrapper to dump a sample of all encoded objects to a temporary directory. This lets you run the system for a while and quickly generate a body of encoded objects that can feed the verification process above. Some moderate human attention can pick a sample of those, or we can randomly take the biggest, the smallest, and something in between... whatever seems appropriate.

Current status:
 - The ceph-dencoder tool works.
 - Capturing encoded objects works.
 - The new encode/decode macros are there.
 - A simple shell script does some basic identity checks (decode of encode is the same).

Still need:
 - how to structure the corpus
 - scripts to do cross-version validation
 - a process for whitelisting differences
 - some slightly special handling for Message, which doesn't use the standard encode/decode wrapper functions

I'm hoping for something that is relatively robust and also mostly painless. In particular, it would be nice to get some decent coverage without a big initial investment. Currently, we just need to write dump() functions and then add types to src/test/encoding/types.h.

Thoughts on this approach?

sage