Re: on-wire compression

On Wed, Feb 3, 2021 at 11:57 AM Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> Hi everyone,
>
> During CDM today Ilya pointed out that there is an open pull request
> that adds on-wire compression to msgr v2 here:
>
>   https://github.com/ceph/ceph/pull/36517
>
> Before we proceed there, though, we decided we should have a broader
> discussion about how on-wire compression should be implemented.
>
> The current pull request implements this purely in the msgr layer.
> Benefits include that it applies to all messages--not just OSD
> replication but also client/OSD traffic, inter-MDS traffic, and so on.
> Downsides include that replicated writes are compressed multiple
> times--once for each replica.
>
> One alternate approach might be:
>  - expand the Message interface to allow set_data()/get_data() to
> accept/expose compressed data (e.g., data + compression_disposition).
> The message header could include a field indicating what codec was
> used.
>  - OSD replication code could compress the data once and pass it to
> the messages for both replicas.
> This would only capture the data portion of the message payload, but
> that is probably the only part we really care about.  It would also
> require some special support for all the users that want to take
> advantage of it... probably the osd replication backend and Objecter
> to start.  One could also imagine extending this to allow compressed
> data to pass all the way through to bluestore, although that brings in
> some additional concerns (bluestore has a max chunk size and some
> alignment considerations, for instance).
>
> Another possibility is integrating compression into bufferlist.  I'm
> not sure that represents a very compelling set of trade-offs, however.
>
> Other thoughts?

These interfaces seem a lot more complicated and require a lot of
management outside the messenger. Besides simply dealing with
buffers, we need to handle negotiating compression techniques and
keeping them uniform across servers running different versions. That
sounds like a lot of pain to me.

If the main concern is re-compressing replicated OSD data, and given
that we're using bufferlists and bufferptrs to share that data amongst
the messages anyway, perhaps we should just do memoization on those
data structures when we compress?
-Greg
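
A minimal sketch of what that memoization could look like, written in
Python rather than Ceph's C++ for brevity. Everything in it is
hypothetical: SharedBuffer stands in for a shared bufferlist/bufferptr,
Message.encode_for_wire() stands in for the messenger's compression
step, and zlib is an arbitrary codec choice. None of these names come
from Ceph or from the pull request.

```python
import zlib


class SharedBuffer:
    """Hypothetical stand-in for a data buffer shared by several messages."""

    def __init__(self, data: bytes):
        self.data = data
        self._compressed = {}    # memo: codec name -> compressed bytes
        self.compress_calls = 0  # counts real compression runs, for illustration

    def compressed(self, codec: str = "zlib") -> bytes:
        """Return the compressed form, compressing only on the first request."""
        cached = self._compressed.get(codec)
        if cached is None:
            if codec != "zlib":
                raise ValueError(f"unsupported codec: {codec}")
            self.compress_calls += 1
            cached = zlib.compress(self.data)
            self._compressed[codec] = cached
        return cached


class Message:
    """Hypothetical message that references, but does not copy, its payload."""

    def __init__(self, target: str, payload: SharedBuffer):
        self.target = target
        self.payload = payload

    def encode_for_wire(self, codec: str = "zlib") -> bytes:
        # The messenger layer asks the buffer for its compressed form;
        # the buffer decides whether any work is actually needed.
        return self.payload.compressed(codec)


def replicate(data: bytes, replicas):
    """Build one message per replica, all sharing a single buffer."""
    buf = SharedBuffer(data)
    messages = [Message(r, buf) for r in replicas]
    frames = [m.encode_for_wire() for m in messages]
    return buf, frames


if __name__ == "__main__":
    buf, frames = replicate(b"x" * 4096, ["osd.1", "osd.2"])
    print("compression runs:", buf.compress_calls)
    print("frames share bytes:", frames[0] is frames[1])
```

The compression still happens in the messenger path, so the Message
interface stays unchanged, but the second replica's message reuses the
cached result. A real version would presumably need to invalidate the
cache when the buffer is modified and to bound the memory held by
cached copies.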

>
> sage
> _______________________________________________
> Dev mailing list -- dev@xxxxxxx
> To unsubscribe send an email to dev-leave@xxxxxxx
>