Re: seastar crimson --- pglock solution discussion

Gregory Farnum <gfarnum@xxxxxxxxxx> · Thu, 20 Dec 2018 15:08:31 -0800

On Thu, Dec 20, 2018 at 2:59 PM Liu, Chunmei <chunmei.liu@xxxxxxxxx> wrote:
>
>
>
> > -----Original Message-----
> > From: Gregory Farnum [mailto:gfarnum@xxxxxxxxxx]
> > Sent: Wednesday, December 19, 2018 3:15 PM
> > To: Liu, Chunmei <chunmei.liu@xxxxxxxxx>
> > Cc: The Esoteric Order of the Squid Cybernetic <ceph-devel@xxxxxxxxxxxxxxx>;
> > Kefu Chai <kchai@xxxxxxxxxx>; Cheng, Yingxin <yingxin.cheng@xxxxxxxxx>; Ma,
> > Jianpeng <jianpeng.ma@xxxxxxxxx>; Radoslaw Zarzynski <rzarzyns@xxxxxxxxxx>
> > Subject: Re: seastar crimson --- pglock solution discussion
> >
> > On Mon, Dec 17, 2018 at 4:23 PM Liu, Chunmei <chunmei.liu@xxxxxxxxx> wrote:
> > >
> > > Hi all,
> > >
> > >     In order to keep IO request sequence in one pg, osd use pglock to guarantee
> > the sequence.  Here in Crimson, it is lockless, so we use future/promise to do the
> > same work.
> > >
> > >     We can design Each PG has its own IO request queue in
> > > seastar-crimson shard. And each PG has one member seastar::promise<>
> > > pg_ready;
> > >
> > >    When need pglock.lock(),  we use the following logic to instead:
> > >
> > >                                     return pg_ready.get_future()
> > //after satisfy the pg_ready promise later then the future will be fulfilled here.
> > >                                        .then([this] {
> > >                                             Pg_ready = seastar::promise<>{};                           // set
> > promise pg_ready no future.
> > >                                                      Dequeue io from pg's request queue and do osd
> > following process.
> > >                                         });
> > >
> > >   When need pglock.unlock(), we use the following logic to instead:
> > >                                     then_wrapped([this] (auto fut) {
> > >                                        fut.forward_to(std::move(pg_ready));     // satisfy the
> > pg_ready promise
> > >                                     }); So the next IO request in the
> > > PG queue will not be dequeued until the pg_ready promise is satisfied after the
> > prior request has already been processed in OSD.
> > >
> > > Do you think it is workable?
> >
> > Have we considered *not* using a "global" pglock and instead tracking
> > dependencies more carefully?
> >
> > IIRC, in the current model we use the pg lock for two different kinds of things
> > 1) to prevent mutating in-memory state incorrectly across racing threads,
> > 2) to provide ordering of certain kinds of operations (eg, reads of in-progress
> > writes)
>
> Another 3) pglog need to be sequenced, first IO request pglog should write first, for replicators consistency. (use pglog head/tail pointer to do recovery)

Right.

> > In Seastar, we shouldn't need to worry about (1) at all.
>
> Yes, that is correct. Since each pg only belong to one seastar thread.
>
> > (2) is of course more tricky, but it seems like we ought to be able to do tracking
> > more easily so as to condition dependencies explicitly on the dependency. For
> > instance, we can condition a write operation being applied to the object store
> > on its preceding pg log operation being done; we can condition reads
> > proceeding on not having a write to the same object in progress, etc.
> >
> How to do the condition in crimson?  Can you give an example here?
>
> (3) Since Crimson code run in async mode, how to grantee pglog write in sequence?

I haven't worked with the Crimson code directly, but I assume we'd
have some kind of sequencer, and that there are pre-existing futures
around the operations being completed or stored on disk.

So couldn't we get those futures back when getting a pglog, and
condition our own steps on those being done at the right points? Or
would that be too expensive to track?
-Greg