Re: ceph encoding optimization

Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> · Thu, 3 Dec 2015 17:17:51 +0800

I write a new patch about this
https://github.com/XinzeChi/ceph/commit/06eb471e463a4687e251273d0b5dfe170acbef2d

If you use __attribute__ ((packed)) after struct, we could encode many
struct member in a batch. There is not compatibility problem if we
keep the order of members defined in struct.

Wait for your comment.

2015-11-10 1:54 GMT+08:00 Milosz Tanski <milosz@xxxxxxxxx>:
> On Mon, Nov 9, 2015 at 10:24 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>> On Mon, 9 Nov 2015, Gregory Farnum wrote:
>>> On Wed, Nov 4, 2015 at 7:07 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>>> > The problem with this approach is that the encoded versions need to be
>>> > platform-independent ? they are shared over the wire and written to
>>> > disks that might get transplanted to different machines. Apart from
>>> > padding bytes, we also need to worry about endianness of the machine,
>>> > etc. *And* we often mutate structures across versions in order to add
>>> > new abilities, relying on the encode-decode process to deal with any
>>> > changes to the system. How could we deal with that if just dumping the
>>> > raw memory?
>>> >
>>> > Now, maybe we could make these changes on some carefully-selected
>>> > structs, I'm not sure. But we'd need a way to pick them out, guarantee
>>> > that we aren't breaking interoperability concerns, etc; and it would
>>> > need to be something we can maintain as a group going forward. I'm not
>>> > sure how to satisfy those constraints without burning a little extra
>>> > CPU. :/
>>> > -Greg
>>>
>>> So it turns out we've actually had issues with this. Sage merged
>>> (wrote?) some little-endian-only optimizations to the cephx code that
>>> broke big-endian systems by doing a direct memcpy. Apparently our
>>> tests don't find these issues, which makes me even more nervous about
>>> taking that sort of optimization into the tree. :(
>>
>> I think the way to make this maintainable will be to
>>
>> 1) Find a clean approach with a simple #if or #ifdef condition for
>> little endian and/or architectures that can handle unaligned int pointer
>> access.
>>
>
> In C++ you can also do that with a template <bool Endian> or using
> std::enable_if. The upside is the same as the downside (depending on
> how you look at it). So it'll add to compile time checks (because it
> won't be discarded by the processor) and it'll take longer to build,
> but you get extra checks and the compiler will later discard the
> unused code.
>
> If you do that, it should be easier to write unit tests for the functionality.
>
>> 2) Maintain the parallel optimized implementation next to the generic
>> encode/decode in a way that makes it as easy as possible to make changes
>> and keep them in sync.
>>
>> 3) Optimize *only* the most recent encoding to minimize complexity.
>>
>> 4) Ensure that there is a set of encode/decode tests that verify they both
>> work, triggered by make check (so that a simple make check on a big
>> endian box will catch errors).  Ideally this'd be part of the
>> test/encoding/readable.sh so that we run it over the entire corpus of old
>> encodings..
>>
>>
>> sage
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Milosz Tanski
> CTO
> 16 East 34th Street, 15th floor
> New York, NY 10016
>
> p: 646-253-9055
> e: milosz@xxxxxxxxx

-- 
Regards,
Xinze Chi
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html