On Tue, 19 Oct 2010, Colin McCabe wrote: > On Tue, Oct 19, 2010 at 9:26 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote: > > - SimpleMessenger::read_message() just needs to call > > ms_deliver_get_data_buffer() if data_len > 0. If it returns true, use the > > provided bufferptr. > > > > - The existing Objecter::read() method returns the tid, which librados > > can then store in a private map of tid -> bufferptrs. When it's > > ms_get_data_buffer() is called it can provide the bufferptr; no need to > > modify the Objecter at all! Just make sure to verify that the entry > > exists in the map before using the operator[] (use find()). > > If we want thread-local data, we should at least use > pthread_setspecific / pthread_getspecific. No need for tids and STL > maps. gettid() is a syscall and it will be slow. There's no pthread specific data or gettid() syscalls here. The 'tid' is a transaction id generated by the Objecter that matchers up replies to requests. > > The main corner case is this: if there is some osd cluster change, it's > > theoretically possible we could have two threads reading read replies to > > the same request off the wire at the same time. Even if we carefully > > define which reply "wins", we need to make sure there isn't a stray thread > > read/accessing the buffer after the read() call returns. > > I just don't see why we have to care about threads at all. The corner case is when we are reading two replies for the same request off of two different sockets from two different OSDs. If they are both reading directly into the user supplied buffer, we need to make sure they both stop before returning success to the user; after that point we're not longer allows to touch that memory. That will probably require some locking and non-blocking IO, which can be tricky since the pthread stuff doesn't usually interact well with select/poll. > I think I can propose a patch which solves this problem without rados > client-specific data, threads, tids, maps, thread-local data, or > changes to SimpleMessenger. SimpleMessenger will change no matter what, because the core problem that we're trying to solve is that it allocates a new buffer when a message is being read off the wire. See SimpleMessenger::read_message(). Somehow that needs to discover and use the user provided buffer instead. That will invariably involve a callback of some sort to find the user buffer. And some means of yanking the buffer away to deal with the corner issue above. > It will just involve modifying bufferlist > so that it can use a buffer that's passed to it, rather than > allocating its own. Then we need something like a freeze operation > that will stop it from trying to reallocate or resize the buffer. This > solution will have much more generality and probably will be useful > elsewhere too. That sounds useful, but not in this particular case... :/ sage