On Sat, Jun 9, 2018 at 8:30 AM Liu, Chunmei <chunmei.liu@xxxxxxxxx> wrote:
>
> Hi Greg,
>
> How do we use message-passing? Does each core maintain a local replica
> of the data structure and use message-passing to tell the other cores
> to update their own local copies? Or can only one core access the data
> structure, with the other cores reaching the shared data structure
> through that core?
>
> Thanks!
> -Chunmei
>
> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Matt Benjamin
> Sent: Friday, June 08, 2018 11:40 AM
> To: Gregory Farnum <gfarnum@xxxxxxxxxx>
> Cc: Kefu Chai <kchai@xxxxxxxxxx>; Sage Weil <sweil@xxxxxxxxxx>; Liu, Chunmei <chunmei.liu@xxxxxxxxx>; The Esoteric Order of the Squid Cybernetic <ceph-devel@xxxxxxxxxxxxxxx>
> Subject: Re: using RCU to replace Locker in config for seastar version
>
> That's what I would have thought, yes. I thought the discussion was
> about RCU in the pthreaded codebase. Dan Lambright prototyped that for
> one of the maps with liburcu, a good while ago.
>
> Matt
>
> On Fri, Jun 8, 2018 at 2:35 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> > Can anybody lay out a concrete use case for us employing real RCU in a
> > Seastar OSD?

I think we can use RCU to implement concurrent reading and updating of
the config settings, since the settings can be versioned: when a reader
reads a setting, it holds a version of the settings; when a writer is
about to update a setting, it waits until all readers have relinquished
the old version. If we instead go with the solution I proposed at the
last CDM, we keep a full copy of all OSD-related settings on each core,
and when the writer changes a setting, the core serving the MCommand
sends messages to all cores to get their copies updated. That model is
simpler, but it is not space-efficient. (Rough sketches of both models
follow below.)

> > When I went through the data structures, it generally seemed like
> > message-passing about data structure changes would be a better way to
> > go than trying to employ any kind of real RCU library (or even the
> > exact abstractions). We might maintain local pointers to constant
> > structures with a per-core ref count to protect deletion, but proper

I think this model works better for osdmap caching, as the osdmap is
not updated very frequently in a healthy cluster, so we can
update/retrieve the map using the message-passing machinery and keep a
single copy maintained by a designated core. But the settings are read
constantly: their reference count on a given core could be flipping
between 1 and 0 all the time (sometimes it could be greater than 1), so
I don't think it's efficient to use message-passing to maintain the
config settings.
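To make the versioned-settings idea concrete, here is a minimal sketch
(hypothetical names, not Ceph code): readers pin a snapshot of the
settings through a refcounted pointer, and the rare writer copies,
mutates, and publishes a new version under a lock. An old version is
freed once the last reader drops its snapshot, which plays the role of
the RCU grace period.

#include <atomic>
#include <memory>
#include <mutex>

struct settings_t {                 // hypothetical flattened config
  unsigned osd_op_num_shards = 5;
  // ... the rest of the options ...
};

class config_store {
  std::shared_ptr<const settings_t> current =
      std::make_shared<const settings_t>();
  std::mutex write_lock;            // could be a spin lock; writes are rare
public:
  // reader side: never blocks, keeps its version as long as it likes
  std::shared_ptr<const settings_t> snapshot() const {
    return std::atomic_load(&current);
  }
  // writer side: copy-on-write, then swing the pointer atomically
  template <typename Fn>
  void update(Fn&& mutate) {
    std::lock_guard<std::mutex> l{write_lock};
    auto next = std::make_shared<settings_t>(*std::atomic_load(&current));
    mutate(*next);
    std::atomic_store(&current,
                      std::shared_ptr<const settings_t>(std::move(next)));
  }
};

This matches what Sage suggests below: each write allocates the new
value on the heap and just updates a pointer, except that the refcount
frees old versions instead of leaking them.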
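The per-core-copy model could look roughly like this with Seastar's
cross-core calls (again hypothetical names; I am assuming
seastar::smp::invoke_on_all for the broadcast): reads are plain local
memory accesses, and only the core handling the MCommand pays for the
cross-core traffic.

#include <seastar/core/future.hh>
#include <seastar/core/smp.hh>
#include <map>
#include <string>

struct local_settings {
  std::map<std::string, std::string> values;  // one full copy per core
};

// one instance per reactor thread; readers never leave their own core
static thread_local local_settings my_settings;

// reader side: core-local lookup, no locking, no cross-core traffic
std::string get_setting(const std::string& key) {
  auto it = my_settings.values.find(key);
  return it == my_settings.values.end() ? std::string{} : it->second;
}

// writer side, called on the core that received the MCommand: apply
// the update once on every core; the future resolves when all copies
// agree
seastar::future<> set_setting(std::string key, std::string value) {
  return seastar::smp::invoke_on_all([key, value] {
    my_settings.values[key] = value;
  });
}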
> > RCU would involve unpredictable access to out-of-core memory locations
> > (especially if we have multiple writers!), whereas if we stick with
> > the message-passing that Seastar expects, we get all the optimizations
> > that come from those very careful data structures.
> > -Greg
> >
> > On Fri, Jun 8, 2018 at 4:41 AM, kefu chai <tchaikov@xxxxxxxxx> wrote:
> >> On Fri, Jun 8, 2018 at 11:08 AM Sage Weil <sweil@xxxxxxxxxx> wrote:
> >>>
> >>> On Fri, 8 Jun 2018, Liu, Chunmei wrote:
> >>> > Hi Kefu,
> >>> >
> >>> > For RCU, I gather the following facts:
> >>> > 1. Readers never block, and multiple concurrent readers are
> >>> > supported.
> >>> > 2. A writer is blocked until all readers have left the read-side
> >>> > critical section. If there are multiple concurrent writers, a spin
> >>> > lock is needed to synchronize them.
> >>> >
> >>> > Are there multiple concurrent writers for the config or the OSDMap?
> >>>
> >>> Not really... and for config, writes are rare. I suspect we could
> >>> even get away with something that just allocates each new value on
> >>> the heap and updates a pointer, and then never bothers to free the
> >>> old values. (Or maybe frees them after a ridiculous amount of time
> >>> has passed.)
> >>
> >> Yeah, RCU supports the concurrency between a single writer and
> >> multiple readers. In the case of config options, I think the
> >> writer(s) would be the threads/reactors serving the MCommand sent
> >> from OSD clients, so in theory there could be multiple writers. But
> >> as Sage pointed out, writes are rare; I think we could use a spin
> >> lock to implement the exclusive lock.
> >>
> >>> For the OSDMap cache, we have a constant stream of new maps coming
> >>> in and old maps getting pruned, so it's a bit trickier.
> >>
> >> Yeah, both MonClient and peer OSDs update the OSD with new maps, and
> >> the OSD keeps trimming unused maps from the cache.
> >>
> >>> > Do you think it is acceptable in the Seastar version, given that
> >>> > we need a spin lock for multiple concurrent writers?
> >>
> >> I think it's fine if writes are relatively rare and what the spin
> >> lock protects are very fast operations, like flipping a flag or
> >> setting a pointer.
> >>
> >>> > Thanks!
> >>> > -Chunmei
> >>
> >> --
> >> Regards
> >> Kefu Chai
>
> --
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
> http://www.redhat.com/en/technologies/storage
>
> tel. 734-821-5101
> fax. 734-769-8938
> cel. 734-216-5309

--
Regards
Kefu Chai
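P.S. For the osdmap cache, roughly what I have in mind with the
message-passing machinery: a single designated core owns the cache, and
the other cores fetch a map through a cross-core call. A sketch,
assuming Seastar's submit_to and foreign_ptr (names like map_owner_core
are made up, this is not crimson code):

#include <seastar/core/sharded.hh>      // foreign_ptr, make_foreign
#include <seastar/core/shared_ptr.hh>
#include <seastar/core/smp.hh>
#include <map>

struct OSDMap { /* epoch, crush map, ... */ };

constexpr unsigned map_owner_core = 0;  // the designated core

// only ever touched on map_owner_core
static std::map<unsigned, seastar::lw_shared_ptr<OSDMap>> osdmap_cache;

// any core may call this: the lookup runs on the owner core, and the
// result comes back wrapped in foreign_ptr, so the non-atomic refcount
// is only ever manipulated on the core that owns it
seastar::future<seastar::foreign_ptr<seastar::lw_shared_ptr<OSDMap>>>
get_map(unsigned epoch) {
  return seastar::smp::submit_to(map_owner_core, [epoch] {
    // assume the epoch is cached; a real version would handle misses
    return seastar::make_foreign(osdmap_cache[epoch]);
  });
}

When the caller drops the foreign_ptr, the refcount decrement is shipped
back to the owner core, so trimming stays safe without the per-core
refcount flipping I was worried about on the config read path.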