On Wed, Jan 17, 2018 at 9:13 PM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote: > On Wed, Jan 17, 2018 at 7:53 AM, kefu chai <tchaikov@xxxxxxxxx> wrote: >> >> i think it could be feasible to come up with some generic adapter to >> use these concepts, but we might need to focus on some typical use >> cases. for example, to encode and write a MOSDOpReply: >> >> assuming that we want to encode it into multiple temporary buffers and >> send them using a seastar::packet. the MOSDOpReply struct has a field >> of vector<OSDOp>, and OSDOp in turn has "bufferlist outdata;" in it. >> for a read op, apparently, we want to avoid memcpy. if the underlying >> read(store) returns a bufferlist, the encoder should be able to >> shallow copy the bufferlist and encapsulate it in a fragment with a >> customized deleter or just a temporary_buffer. if we want to avoid the >> conversion from buffer::list to seastar::packet, probably we should >> generalize the appender so it > > Yes, we'd like to avoid such copies. > >> >> - is able to manage the life cycle of the buffers in bufferlist being >> encoded. i.e. should flush current temporary_buffer, and translate all >> buffer::ptr to seastar::temporary_buffer, then append them to the >> output seastar::packet >> - just writes to the local temporary_buffer directly when encoding >> other kinds of objects. >> >> so, i don't think we can have a very generic appender if we want to be >> efficient. but we can define one which works for certain output type. >> in this example, it's seastar::packet. > > That sounds right to me. > >> >>> >>> DynamicBuffer::prepare() looks like a natural way to do the append part on >>> encode, but it returns a MutableBufferSequence so is not guaranteed to be >>> contiguous. Can you remind me where we really depend on contiguous buffers? >> >> IIRC, we don't really depend on contiguous buffers. but we do expect >> that the buffer we are manipulating is continuous, so we can have the >> opportunity to optimize the dencoder: >> >> - encoding: it's a noticeable overhead to check the underlying buffer >> boundary when writing to a buffer. that's why we introduced the >> appender to facilitate the encoding. see [1] > > This is kind of surprising. forgot to mention, preallocated buffer also avoid the possible overhead to allocate multiple new buffers when encoding progresses. > >> - decoding: for the similar reason, it's more performant to just read >> a contiguous buffer than reading from a segmented one. in other words, >> if we can ensure a the bufferlist being decoded has only a single raw >> buffer, why bother checking the boundary? > > Is that likely? If the decoding case is from seastar, isn't the > incoming data likely to be in the form of several mbufs? > >> >> if we want to encode object into the mutable buffers returned >> DynamicBuffer::prepare(), we need to check x.size() when writing to >> x.data() and move on to the next one if current buffer chunk is too >> small. >> >> -- >> [1] https://www.spinics.net/lists/ceph-devel/msg32608.html >> >> >>> Is it just that some types set need_contiguous=false because we haven't yet >>> implemented logic for segmented buffers, or are there cases where >>> implementing that logic is actually intractable? >> >> "need_contiguous" trait was added to the denc for the case where the >> buffer being decoded are large objects and they could be segmented. >> these objects are read from bluestore. but i think we would expand the >> case to the large message if we rebuild/flatten the segmented buffer, >> the overhead of reallocating the buffer and memcpy calls offsets the >> performance improvement we achieve by avoiding the bound check and >> simpler logic. before that, we always do the rebuild before decoding a >> buffer::list::iterator. > > That's what I would have expected. > >> >>> >>> Thanks, >>> Casey >> >> >> >> -- >> Regards >> Kefu Chai > > > > -- > > Matt Benjamin > Red Hat, Inc. > 315 West Huron Street, Suite 140A > Ann Arbor, Michigan 48103 > > http://www.redhat.com/en/technologies/storage > > tel. 734-821-5101 > fax. 734-769-8938 > cel. 734-216-5309 -- Regards Kefu Chai -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html