On Tue, 19 Oct 2010, Colin McCabe wrote: > Hi Takuya, > > Thanks for looking at this! > > Unfortunately I think you have taken the wrong approach here. There > isn't any need for thread-local data in the Objecter. It's not threads > that the Objecter cares about, it's clients. Also, I don't think a > change like this needs to involve SimpleMessenger and all that. The problem Takuya is solving is that the messenger currently always allocates _new_ buffers for any data it reads off the network, including the data payload of read operations. His goal is to avoid later doing a memcpy into the user provided buffer. In principle, I think this is the right approach. There are a few things I think we should change with the implementation before merging anything: - We should make the fetch_buffer_func a Dispatcher method, like all of the existing ms_* callbacks. That avoids the single callback instance for the whole SimpleMessenger, and the weird constructor argument. A helper method Messenger::ms_deliver_get_data_buffer() will probably be needed too. This keeps the interface consistent, and allows any registered Dispatcher to specify buffers to read into for messages it expects. - The ms_get_data_buffer() should just take the ceph_msg_header* as an argument. That has a tid in it _and_ the message type--everything we might need to determine which message (reply) we are receiving and thus which buffer we should read into. I pushed a msgr_zerocopy_read branch to ceph.git that implements this part (patch below). - SimpleMessenger::read_message() just needs to call ms_deliver_get_data_buffer() if data_len > 0. If it returns true, use the provided bufferptr. - The existing Objecter::read() method returns the tid, which librados can then store in a private map of tid -> bufferptrs. When it's ms_get_data_buffer() is called it can provide the bufferptr; no need to modify the Objecter at all! Just make sure to verify that the entry exists in the map before using the operator[] (use find()). The main corner case is this: if there is some osd cluster change, it's theoretically possible we could have two threads reading read replies to the same request off the wire at the same time. Even if we carefully define which reply "wins", we need to make sure there isn't a stray thread read/accessing the buffer after the read() call returns. I'm not sure what the solution to that one is yet. The kclient doesn't have separate threads, and takes pains to yank the pages out from under incoming message; that approach doesn't work well with the SimpleMessenger threading model. :/ sage -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html