> -----Original Message----- > From: Gregory Farnum [mailto:gfarnum@xxxxxxxxxx] > Sent: Wednesday, December 19, 2018 3:15 PM > To: Liu, Chunmei <chunmei.liu@xxxxxxxxx> > Cc: The Esoteric Order of the Squid Cybernetic <ceph-devel@xxxxxxxxxxxxxxx>; > Kefu Chai <kchai@xxxxxxxxxx>; Cheng, Yingxin <yingxin.cheng@xxxxxxxxx>; Ma, > Jianpeng <jianpeng.ma@xxxxxxxxx>; Radoslaw Zarzynski <rzarzyns@xxxxxxxxxx> > Subject: Re: seastar crimson --- pglock solution discussion > > On Mon, Dec 17, 2018 at 4:23 PM Liu, Chunmei <chunmei.liu@xxxxxxxxx> wrote: > > > > Hi all, > > > > In order to keep IO request sequence in one pg, osd use pglock to guarantee > the sequence. Here in Crimson, it is lockless, so we use future/promise to do the > same work. > > > > We can design Each PG has its own IO request queue in > > seastar-crimson shard. And each PG has one member seastar::promise<> > > pg_ready; > > > > When need pglock.lock(), we use the following logic to instead: > > > > return pg_ready.get_future() > //after satisfy the pg_ready promise later then the future will be fulfilled here. > > .then([this] { > > Pg_ready = seastar::promise<>{}; // set > promise pg_ready no future. > > Dequeue io from pg's request queue and do osd > following process. > > }); > > > > When need pglock.unlock(), we use the following logic to instead: > > then_wrapped([this] (auto fut) { > > fut.forward_to(std::move(pg_ready)); // satisfy the > pg_ready promise > > }); So the next IO request in the > > PG queue will not be dequeued until the pg_ready promise is satisfied after the > prior request has already been processed in OSD. > > > > Do you think it is workable? > > Have we considered *not* using a "global" pglock and instead tracking > dependencies more carefully? > > IIRC, in the current model we use the pg lock for two different kinds of things > 1) to prevent mutating in-memory state incorrectly across racing threads, > 2) to provide ordering of certain kinds of operations (eg, reads of in-progress > writes) Another 3) pglog need to be sequenced, first IO request pglog should write first, for replicators consistency. (use pglog head/tail pointer to do recovery) > > In Seastar, we shouldn't need to worry about (1) at all. Yes, that is correct. Since each pg only belong to one seastar thread. > (2) is of course more tricky, but it seems like we ought to be able to do tracking > more easily so as to condition dependencies explicitly on the dependency. For > instance, we can condition a write operation being applied to the object store > on its preceding pg log operation being done; we can condition reads > proceeding on not having a write to the same object in progress, etc. > How to do the condition in crimson? Can you give an example here? (3) Since Crimson code run in async mode, how to grantee pglog write in sequence? > Am I misunderstanding something? Or does this make sense? > -Greg