On Fri, Jun 8, 2018 at 5:29 PM, Liu, Chunmei <chunmei.liu@xxxxxxxxx> wrote:
> Hi Greg,
>
> How do we use message-passing? Does each core maintain a local
> replicated copy of the data structure and use message-passing to tell
> the other cores to update their own local copies? Or can only one core
> access the data structure, with the other cores getting the shared data
> through that core?

Just as a first pass, in the case of the config structure it might be
something like:
1) Create the new config struct in memory on the "server" core
2) Use the "sharded_shared_ptr" I'll discuss below to give each core a
reference to it
3) Send a message to the cores telling them this has happened
4) At a later time, clean up the previous config structure once all
cores drop their refcounts to zero.

Now, that looks an awful lot like RCU, which makes sense since it's a
useful basic algorithm. But we're avoiding trying to properly track
accesses via a library like the liburcu that's been referenced. I like
that both because it limits the number of paradigms a Ceph developer
needs to be able to work with, and also because we've prototyped using
liburcu before and found it made things *slower*.

We can do something similar for the osd_map_cache, where local threads
keep their own map of epochs to pointers, with local integer ref
counts, and drop the global atomic count when the thread drops all its
users.

On Sat, Jun 9, 2018 at 12:16 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> > > When I went through the data structures, it generally seemed like
>> > > message-passing about data structure changes would be a better way to
>> > > go than trying to employ any kind of real RCU library (or even the
>> > > exact abstractions). We might maintain local pointers to constant
>> > > structures with a per-core ref count to protect deletion, but proper
>
> Is there already a per-core ref-counting foo_ptr<> that does this? (This
> being a core/thread-local refcount, and a global atomic refcount?) This
> seems useful in lots of places (probably most places we use
> RefCountedObject now... things like OSDSession).

Yeah, I don't think anything like this exists. But it'll be a useful
tool, *especially* when we start mixing in POSIX threads.

Just to be clear, I'm thinking something like:

template <typename T>
class sharded_shared_pointer_owner {
  struct root_pointer {
    std::atomic<int> ref_count;  // global count, shared across cores
    T *object;
  };
  int local_ref_count;           // per-core count, no atomics needed
  root_pointer *parent;
};

template <typename T>
class sharded_shared_pointer {
  sharded_shared_pointer_owner<T> *parent;
};

Where copying the sharded_shared_pointer increments the local_ref_count,
and the sharded_shared_pointer_owner is used when copying between
threads and increments the root_pointer::ref_count. All names subject
to change for better ones, of course. (There's a slightly fuller sketch
of this in the P.S. below.)

Another thought (I really don't know how these costs work out): when we
drop the sharded_shared_pointer_owner's local_ref_count to zero, we
could pass a message to the owner thread instead of directly
manipulating the parent->ref_count atomic. It's hard to have a good
intuition for those costs, and I know I don't!

(The nice part about using pointer structures instead of direct access
throughout the code is that it's of course easy to change the
cross-core implementation as we experiment and get new data.)

-Greg
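
P.S. To make the two-level refcount idea above concrete, here's a rough
compilable sketch. This is just an illustration, not settled design:
all the names, the memory orderings, and the API shape (root(),
local_ref()/local_unref(), and so on) are placeholders.

#include <atomic>
#include <cassert>

// Shared root: one global atomic count per cross-core object.
// The count starts at zero; each owner holds exactly one global ref.
template <typename T>
struct root_pointer {
  std::atomic<int> ref_count{0};
  T *object;
  explicit root_pointer(T *obj) : object(obj) {}
};

// One owner per core/thread that holds references. Creating or
// destroying an owner touches the atomic once; everything else on
// that core is a plain non-atomic integer bump.
template <typename T>
class sharded_shared_pointer_owner {
public:
  explicit sharded_shared_pointer_owner(root_pointer<T> *root)
      : local_ref_count(0), parent(root) {
    parent->ref_count.fetch_add(1, std::memory_order_relaxed);
  }
  sharded_shared_pointer_owner(const sharded_shared_pointer_owner &) = delete;
  sharded_shared_pointer_owner &
  operator=(const sharded_shared_pointer_owner &) = delete;
  ~sharded_shared_pointer_owner() {
    assert(local_ref_count == 0);  // all local handles must be gone first
    if (parent->ref_count.fetch_sub(1, std::memory_order_acq_rel) == 1) {
      delete parent->object;       // last core out cleans up
      delete parent;
    }
  }
  void local_ref()   { ++local_ref_count; }
  void local_unref() { --local_ref_count; }
  T *get() const     { return parent->object; }
  root_pointer<T> *root() const { return parent; }

private:
  int local_ref_count;      // core-local, no atomics
  root_pointer<T> *parent;  // shared across cores
};

// The handle code actually passes around *within* one core; copying
// it is a non-atomic increment on that core's owner.
template <typename T>
class sharded_shared_pointer {
public:
  explicit sharded_shared_pointer(sharded_shared_pointer_owner<T> *o)
      : parent(o) {
    parent->local_ref();
  }
  sharded_shared_pointer(const sharded_shared_pointer &rhs)
      : parent(rhs.parent) {
    parent->local_ref();
  }
  sharded_shared_pointer &operator=(const sharded_shared_pointer &rhs) {
    if (this != &rhs) {
      parent->local_unref();
      parent = rhs.parent;
      parent->local_ref();
    }
    return *this;
  }
  ~sharded_shared_pointer() { parent->local_unref(); }
  T &operator*() const  { return *parent->get(); }
  T *operator->() const { return parent->get(); }

private:
  sharded_shared_pointer_owner<T> *parent;
};

Handing a reference to another core would then mean constructing a new
owner on that core from root() (one atomic bump), after which all
copies on that core are plain integer ops. And the message-passing idea
above would just swap the fetch_sub in the owner's destructor for a
message to the owning thread; since callers only ever see the pointer
types, nothing else would need to change.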