Re: bufferlist-free seastar osd?

On Fri, Nov 30, 2018 at 7:02 AM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
>
>
> On 11/29/18 7:17 PM, Gregory Farnum wrote:
> >> On Thu, Nov 29, 2018 at 1:46 PM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
> >>> Bufferlist is a common theme in Mark's weekly performance meetings, and
> >>> today's was no exception. Radoslaw highlighted the cost of atomic ref
> >>> counting associated with sharing buffers, and there was broad agreement
> >>> that the seastar osd's data path should avoid using bufferlist entirely
> >>> for that reason.
> >>>
> >>> However, one choice that I made in the design of the seastar messenger
> >>> was to use Ceph's existing bufferlist-based Message types so that their
> >>> encode/decode wouldn't diverge between the seastar osd and its librados
> >>> clients and peers like ceph-mon/mgr.
> >>>
> >>> In order to create a bufferlist-free data path, we'll need to find some
> >>> abstraction that allows Message types to define a serialization format
> >>> that can work both for bufferlist and for seastar's native buffer types
> >>> - whether that covers all Message types, or just some subset of
> >>> important ones like MOSDOp/Reply.
> >>>
> >>> Any thoughts on what this could look like?
> >>>
> >>> Casey
> > On Thu, Nov 29, 2018 at 2:32 PM Noah Watkins <nwatkins@xxxxxxxxxx> wrote:
> >> Potentially stupid question... if there exists a buffer container that
> >> can work in the new data path _without reference counting_, why not
> >> create a bufferlist super class that doesn't do reference counting?
> > Or even just let seastar wrap its own buffers in a new buffer ptr type
> > and generate a new bufferlist when it passes that data into the
> > non-seastar-ized code?
> > -Greg
>
> Yeah, that's essentially what the messenger does now - it reads
> seastar's native buffers off the network, wraps them in a buffer::raw
> type, and passes them as bufferlists to ::decode_message(). There's a
> similar conversion from bufferlist to seastar::net::packet for the
> send() side.
>
> So the challenge is to find a way to decode messages from their native
> buffers, avoiding conversions to bufferlist until we need to pass them
> to common code that expects bufferlist. Ideally, the data path won't
> require any of these conversions.

So while the literal encode/decode functions in a given Message are
fixed to a single buffer type, their actual logic is pretty friendly to
being swapped out with macros, thanks to ENCODE_START etc. I haven't
thought about it deeply, but I presume it would be pretty simple to
extend the message types with seastar-buffer encodes that take a
different buffer type but otherwise keep the same signature, and to
compile each Message's encode/decode functions for both buffer types,
as either macros or templates?
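
To make the template variant concrete, here is a minimal sketch. Everything
in it is made up for illustration: SegmentedBuf stands in for a
bufferlist-like container, FlatBuf for a contiguous seastar-style buffer,
and ToyOp for a Message. None of these are real Ceph or seastar types, and
the sketch leaves out the versioning header that ENCODE_START handles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for a segmented, bufferlist-like container.
struct SegmentedBuf {
  std::list<std::string> segments;
  void append(const char* p, std::size_t n) { segments.emplace_back(p, n); }
  std::string flatten() const {
    std::string out;
    for (const auto& s : segments) out += s;
    return out;
  }
};

// Hypothetical stand-in for a contiguous, seastar-style buffer.
struct FlatBuf {
  std::vector<char> data;
  void append(const char* p, std::size_t n) { data.insert(data.end(), p, p + n); }
  std::string flatten() const { return std::string(data.begin(), data.end()); }
};

// Primitive encoders only require that Buf has append(const char*, size_t).
template <typename Buf>
void encode_u32(std::uint32_t v, Buf& b) {
  char raw[4];
  for (int i = 0; i < 4; ++i)  // fixed little-endian layout
    raw[i] = static_cast<char>((v >> (8 * i)) & 0xff);
  b.append(raw, sizeof(raw));
}

template <typename Buf>
void encode_str(const std::string& s, Buf& b) {
  encode_u32(static_cast<std::uint32_t>(s.size()), b);  // length prefix
  b.append(s.data(), s.size());
}

// Toy message: one encode body, compiled once per buffer type.
struct ToyOp {
  std::uint32_t epoch;
  std::string oid;

  template <typename Buf>
  void encode(Buf& b) const {
    encode_u32(epoch, b);
    encode_str(oid, b);
  }
};

// Both instantiations must produce identical wire bytes.
inline void check_same_wire_format() {
  ToyOp op{7, "obj"};
  SegmentedBuf seg;
  FlatBuf flat;
  op.encode(seg);
  op.encode(flat);
  assert(seg.flatten() == flat.flatten());
}
```

The point is only that the field order lives in one place, so the two
buffer types can't drift apart on the wire format.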

Then for the rare message types that need to convert their underlying
data into the other buffer format, we can do that either explicitly at
the handoff boundaries or implicitly when a function asks for the data
in a particular format.
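
The two handoff options could look roughly like this. Again a sketch under
assumptions: FlatBuf, SegmentedBuf, to_segmented() and Payload are
hypothetical names, not existing Ceph or seastar APIs, and the conversion
here copies the bytes where a real one would presumably wrap the native
buffer instead.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical native (seastar-style) and legacy (bufferlist-like) buffers.
struct FlatBuf { std::vector<char> data; };
struct SegmentedBuf { std::vector<std::string> segments; };

// Explicit option: the caller converts at the handoff boundary.
inline SegmentedBuf to_segmented(const FlatBuf& f) {
  SegmentedBuf s;
  if (!f.data.empty())
    s.segments.emplace_back(f.data.begin(), f.data.end());  // copies bytes
  return s;
}

// Implicit option: the payload converts lazily, the first time some
// function asks for the legacy format, and caches the result.
class Payload {
  FlatBuf native_;
  mutable std::optional<SegmentedBuf> legacy_;
  mutable int conversions_ = 0;

public:
  explicit Payload(FlatBuf f) : native_(std::move(f)) {}

  const FlatBuf& native() const { return native_; }

  const SegmentedBuf& legacy() const {
    if (!legacy_) {
      legacy_ = to_segmented(native_);
      ++conversions_;
    }
    return *legacy_;
  }

  // Number of conversions actually performed (0 on a pure native path).
  int conversions() const { return conversions_; }
};
```

A message that stays on the native path never pays for the conversion, and
one that crosses into common code pays for it once.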

Am I missing something that makes this particularly difficult?
-Greg


