Thanks Kefu, I will study the seastar ptrs first. I still have one question: how do we handle a data structure such as the op seq, where each consumer needs to read it and increment it by 1? The message-passing approach we discussed does not seem to fit this situation. Any ideas? Thanks!
-Chunmei

> -----Original Message-----
> From: kefu chai [mailto:tchaikov@xxxxxxxxx]
> Sent: Wednesday, June 20, 2018 10:37 PM
> To: Liu, Chunmei <chunmei.liu@xxxxxxxxx>
> Cc: Gregory Farnum <gfarnum@xxxxxxxxxx>; Sage Weil <sage@xxxxxxxxxxxx>; Matt Benjamin <mbenjami@xxxxxxxxxx>; Kefu Chai <kchai@xxxxxxxxxx>; The Esoteric Order of the Squid Cybernetic <ceph-devel@xxxxxxxxxxxxxxx>
> Subject: Re: using RCU to replace Locker in config for seastar version
>
> On Wed, Jun 13, 2018 at 3:39 AM Liu, Chunmei <chunmei.liu@xxxxxxxxx> wrote:
> >
> > Hi Greg,
> >
> > I still have some questions, please see below.
> >
> > -----Original Message-----
> > From: Gregory Farnum [mailto:gfarnum@xxxxxxxxxx]
> > Sent: Sunday, June 10, 2018 10:58 AM
> > To: Sage Weil <sage@xxxxxxxxxxxx>
> > Cc: kefu chai <tchaikov@xxxxxxxxx>; Liu, Chunmei <chunmei.liu@xxxxxxxxx>; Matt Benjamin <mbenjami@xxxxxxxxxx>; Kefu Chai <kchai@xxxxxxxxxx>; The Esoteric Order of the Squid Cybernetic <ceph-devel@xxxxxxxxxxxxxxx>
> > Subject: Re: using RCU to replace Locker in config for seastar version
> >
> > On Fri, Jun 8, 2018 at 5:29 PM, Liu, Chunmei <chunmei.liu@xxxxxxxxx> wrote:
> > > Hi Greg,
> > >
> > > How do we use message-passing? Does each core maintain a local replicated copy of the data structure and use message-passing to inform the other cores to update their own local copies? Or can only one core access the data structure, with the other cores reaching the shared data through that core?
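For the op seq question above, message passing can still work if exactly one core owns the counter and every consumer asks the owner for the next value. A minimal sketch in plain C++ threads (standing in for seastar's submit_to(); all names here are illustrative, not Ceph or seastar code):

```cpp
// Owner-core pattern for a shared op sequence: only one thread ever
// touches next_seq_, so no atomic read-modify-write is needed; other
// threads "submit" a request and get the value back via a future.
#include <condition_variable>
#include <cstdint>
#include <future>
#include <mutex>
#include <queue>
#include <thread>

class OpSeqOwner {
  uint64_t next_seq_ = 0;                       // owned by one thread only
  std::queue<std::promise<uint64_t>> pending_;  // incoming "messages"
  std::mutex m_;
  std::condition_variable cv_;
  bool stop_ = false;
  std::thread owner_;

public:
  OpSeqOwner() : owner_([this] { run(); }) {}
  ~OpSeqOwner() {
    { std::lock_guard<std::mutex> l(m_); stop_ = true; }
    cv_.notify_one();
    owner_.join();
  }
  // Called from any thread: ask the owner for the next sequence number.
  std::future<uint64_t> get_next_seq() {
    std::promise<uint64_t> p;
    auto f = p.get_future();
    { std::lock_guard<std::mutex> l(m_); pending_.push(std::move(p)); }
    cv_.notify_one();
    return f;
  }

private:
  void run() {
    std::unique_lock<std::mutex> l(m_);
    for (;;) {
      cv_.wait(l, [this] { return stop_ || !pending_.empty(); });
      while (!pending_.empty()) {
        pending_.front().set_value(next_seq_++);  // read-and-increment,
        pending_.pop();                           // serialized by ownership
      }
      if (stop_) return;
    }
  }
};
```

The mutex/queue pair here only emulates seastar's cross-core message channel; in a real seastar build the requests would be submit_to() calls and the owner shard would need no lock at all.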
> >
> > Just as a first pass, in the case of the config structure it might be something like:
> > 1) Create the new config struct in memory on the "server" core.
> > 2) Use the "sharded_shared_ptr" I'll discuss below to give each core a reference to it.
> > 3) Send a message to the cores telling them this has happened.
> > 4) At a later time, clean up the previous config structure once all cores have dropped their refcounts to zero.
> >
> > [liucm] You said clean up the previous config structure; does that mean that when a modification happens, we need to copy the data structure and then update the copy?
> > [liucm] A local refcount means this core has users accessing the data structure, and the global atomic refcount means some cores are accessing the data structure, right?
> > [liucm] You said all cores drop their refcounts to zero, so that is the local refcount; how does the server core know about it? Does the local core send a message to the server, or is it enough for the local core itself to know?
>
> The local core will send a message to the owner core.
>
> > [liucm] If the server core (or the local core?) sees a core's local refcount drop to zero, does the server core (or the local core?) decrement the atomic global refcount? Which core does this work?
>
> I think I've explained this in the previous reply: it would be the server (owner) core that checks the local refcount.
>
> > [liucm] The server core keeps checking until the global refcount reaches zero, and then updates the data structure pointer to the new copy? How does it monitor the global refcount dropping to zero?
>
> I encourage you to refer to foreign_ptr<> in seastar.
>
> > Now, that looks an awful lot like RCU, which makes sense since it's a useful basic algorithm. But we're avoiding trying to properly track accesses via a library like the liburcu that's been referenced. I like that both because it limits the number of paradigms a Ceph developer needs to be able to work with, and also because we've prototyped using liburcu before and found it made things *slower*.
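Greg's four steps can be sketched with plain std::shared_ptr standing in for the per-core refcounting (Config, Core, and publish() are illustrative stand-ins, not Ceph types):

```cpp
// Steps 1-4 in miniature: the "server" builds a new Config, hands each
// core a reference, "messages" them to switch over, and the previous
// Config is destroyed automatically once the last core drops its ref.
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Config {
  std::map<std::string, std::string> values;  // hypothetical contents
};

struct Core {
  std::shared_ptr<const Config> local;  // this core's reference (step 2)
  void on_new_config(std::shared_ptr<const Config> c) {  // step 3's message
    local = std::move(c);  // dropping the old ref may free it (step 4)
  }
};

std::shared_ptr<const Config> publish(std::vector<Core>& cores,
                                      std::map<std::string, std::string> kv) {
  auto conf = std::make_shared<const Config>(Config{std::move(kv)});  // step 1
  for (auto& core : cores)
    core.on_new_config(conf);  // steps 2-3
  return conf;
}
```

A real sharded_shared_ptr would replace the atomic count inside std::shared_ptr with a plain per-core count plus one global atomic, but the publish/retire lifecycle is the same.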
> > We can do something similar for the osd_map_cache, where local threads keep their own map of epochs to pointers, with local integer refcounts, and drop the global atomic count when the thread drops all users.
> >
> > > On Sat, Jun 9, 2018 at 12:16 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > >> > > When I went through the data structures, it generally seemed like message-passing about data structure changes would be a better way to go than trying to employ any kind of real RCU library (or even the exact abstractions). We might maintain local pointers to constant structures with a per-core ref count to protect deletion, but proper
> > >
> > > Is there already a per-core ref-counting foo_ptr<> that does this? (This being a core/thread-local refcount, plus a global atomic refcount?) This seems useful in lots of places (probably most places we use RefCountedObject now... things like OSDSession).
> >
> > Yeah, I don't think anything like this exists. But it'll be a useful tool, *especially* when we start mixing in posix threads.
> >
> > Just to be clear, I'm thinking of something like:
> >
> >   class sharded_shared_pointer_owner<T> {
> >     int local_ref_count;
> >     root_pointer<T> {
> >       atomic_t ref_count;
> >       T *object;
> >     }
> >     root_pointer<T> *parent;
> >   }
> >
> >   class sharded_shared_pointer<T> {
> >     sharded_shared_pointer_owner *parent;
> >   }
> >
> > Copying the sharded_shared_pointer increments the local_ref_count, while the sharded_shared_pointer_owner is used when copying between threads and increments the root_pointer::ref_count.
> >
> > [liucm] I don't understand the above sentence; what do you mean by copying the pointer here? Can you give a detailed example?
>
> It's all about the semantics and the implementation of a typical smart_ptr<>. For instance, the copy constructor should increment the local_ref_count.
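One hypothetical way to flesh out Greg's sketch into compilable C++ (the types and layout are a guess at the intent, not an agreed design): copying a handle within a core touches only a plain int, and only creating or destroying a per-core owner touches the shared atomic count.

```cpp
// sharded_shared_pointer sketch: one root_pointer per object with an
// atomic count of cores; one owner per core with a non-atomic count of
// local users; cheap handles that bump only the local count on copy.
#include <atomic>

template <typename T>
struct root_pointer {
  std::atomic<int> ref_count{0};  // one count per core holding the object
  T* object;
  explicit root_pointer(T* o) : object(o) {}
  ~root_pointer() { delete object; }
};

template <typename T>
struct sharded_shared_pointer_owner {
  int local_ref_count = 0;   // plain int: touched only by this core
  root_pointer<T>* parent;
  explicit sharded_shared_pointer_owner(root_pointer<T>* p) : parent(p) {
    parent->ref_count.fetch_add(1);  // cross-core: atomic, but done once
  }
  ~sharded_shared_pointer_owner() {
    if (parent->ref_count.fetch_sub(1) == 1)  // last core out
      delete parent;
  }
};

template <typename T>
class sharded_shared_pointer {
  sharded_shared_pointer_owner<T>* parent_;
public:
  explicit sharded_shared_pointer(sharded_shared_pointer_owner<T>* p)
      : parent_(p) { ++parent_->local_ref_count; }
  sharded_shared_pointer(const sharded_shared_pointer& o)
      : parent_(o.parent_) { ++parent_->local_ref_count; }  // cheap copy
  ~sharded_shared_pointer() { --parent_->local_ref_count; }
  T& operator*() const { return *parent_->parent->object; }
  T* operator->() const { return parent_->parent->object; }
};
```

Per Greg's later note, the owner's destructor could instead send a message to the owner thread rather than touching the atomic directly; this sketch takes the simpler atomic route.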
> > > [liucm] In the above data structure, which part is used by the server core, and which part is used by the other cores? I guess root_pointer points to the shared data structure, of which there is only one copy on the server core, and local_ref_count is each core's local variable, right?
>
> I think you can refer to the implementations of shared_ptr<> and lw_shared_ptr<> in seastar/core/shared_ptr.hh, and foreign_ptr<> in seastar/core/sharded.hh. Actually, lw_shared_ptr<> is basically what we need when implementing SharedLRU<> in Ceph. What is missing in seastar's weak_ptr<> and lw_shared_ptr<>/shared_ptr<> is the ability to construct a weak_ptr<> from a shared_ptr<>, and to promote a weak_ptr<> to a shared_ptr<>. And foreign_ptr<> is what we need to share a given osdmap from its owner core with the non-owner cores. My plan is to re-implement a seastar variant of std::shared_ptr<> and std::weak_ptr<>, so they are more lightweight than their standard library counterparts in that they will use plain machine-word integers for refcounting instead of atomic types.
>
> If RCU is not as performant as we expect, we can also apply the foreign_ptr<> machinery to config, if we want to keep a single copy of config in the OSD. To be specific:
>
> 0. The owner core caches a map of settings: Config. It returns a shared_ptr<ConfigProxy> upon a request for config from any of the fibers, and keeps track of this shared_ptr<ConfigProxy> using a weak_ptr<>. If the ConfigProxy is destroyed, we should create a new instance of it upon request. Please note, we assume that a shared_ptr<> can be constructed from a weak_ptr<>; this ability is not offered by seastar's shared_ptr<> at this moment.
> 1. All non-owner fibers can only update settings using a submit_to() call to the owner core.
> 2.
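The weak_ptr-to-shared_ptr promotion kefu wants is what the standard library already offers as std::weak_ptr::lock(); step 0's owner-side cache might look like the following sketch (ConfigProxy here is a stand-in type, not the real Ceph class):

```cpp
// Step 0 in miniature: the owner core hands out shared_ptr<ConfigProxy>
// but only tracks it weakly, so the proxy dies when the last consumer
// drops it, and a fresh one is created on the next request.
#include <cassert>
#include <memory>

struct ConfigProxy { int version = 0; };  // hypothetical stand-in

class ConfigOwner {
  std::weak_ptr<ConfigProxy> cached_;  // non-owning track of the live proxy
public:
  // Return the existing proxy if one is still alive, else make a new one.
  std::shared_ptr<ConfigProxy> get_proxy() {
    if (auto p = cached_.lock())  // promote weak_ptr -> shared_ptr
      return p;
    auto p = std::make_shared<ConfigProxy>();
    cached_ = p;                  // remember it without keeping it alive
    return p;
  }
};
```

This is exactly the shared_ptr<>/weak_ptr<> pairing kefu plans to re-implement for seastar with non-atomic counts, since within one shard the atomic operations of the std versions are wasted cost.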
> All fibers on the owner core trying to update settings should wait on a seastar condition_variable if the weak_ptr<> is tracking some ConfigProxy; the condition variable will be signaled when the ConfigProxy is destroyed.
> 3. Local consumers of Config should access it via shared_ptr<ConfigProxy>.
> 4. Foreign consumers of Config should access it via foreign_ptr<shared_ptr<ConfigProxy>>.
>
> > -thanks!
> >
> > All names subject to change for better ones, of course.
> > Another thought (I really don't know how these costs work out): when we drop the sharded_shared_pointer_owner local_ref_count to zero, we could pass a message to the owner thread instead of directly manipulating the parent->ref_count atomic. It's hard to have a good intuition for those costs, and I know I don't! (The nice part about using pointer structures instead of direct access throughout the code is that it's of course easy to change the cross-core implementation as we experiment and get new data.)
> > -Greg
>
> --
> Regards
> Kefu Chai