Re: seastar crimson --- pglock solution discussion

Gregory Farnum <gfarnum@xxxxxxxxxx> · Wed, 19 Dec 2018 15:15:18 -0800

On Mon, Dec 17, 2018 at 4:23 PM Liu, Chunmei <chunmei.liu@xxxxxxxxx> wrote:
>
> Hi all,
>
>     To preserve the order of IO requests within one PG, the OSD uses the pglock. Crimson is lockless, so we would use future/promise to do the same job.
>
>     We can give each PG its own IO request queue in its seastar-crimson shard, and give each PG a member seastar::promise<> pg_ready;
>
>    Where we would call pglock.lock(), we use the following logic instead:
>
>        return pg_ready.get_future()              // fulfilled once the pg_ready promise is satisfied later
>            .then([this] {
>                pg_ready = seastar::promise<>{};  // replace pg_ready with a fresh, unsatisfied promise
>                // dequeue the next IO from the PG's request queue and continue normal OSD processing
>            });
>
>   Where we would call pglock.unlock(), we use the following logic instead:
>
>        then_wrapped([this] (auto fut) {
>            fut.forward_to(std::move(pg_ready));  // satisfy the pg_ready promise
>        });
>
> So the next IO request in the PG queue is not dequeued until the pg_ready promise is satisfied, which happens only after the OSD has finished processing the prior request.
>
> Do you think this is workable?

Have we considered *not* using a "global" pglock and instead tracking
dependencies more carefully?

IIRC, in the current model we use the pg lock for two different kinds of things:
1) to prevent mutating in-memory state incorrectly across racing threads,
2) to provide ordering of certain kinds of operations (e.g., reads of
in-progress writes).

In Seastar, we shouldn't need to worry about (1) at all.
(2) is of course more tricky, but it seems like we ought to be able to
track dependencies more easily and condition each operation explicitly
on the things it actually depends on. For instance, we can condition a
write operation being applied to the object store on its preceding pg
log operation being done; we can condition reads proceeding on not
having a write to the same object in progress, etc.
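
A rough sketch of the read-versus-write case, in plain C++ with no
Seastar so it stands alone. Every name here (ObjectDeps, submit_write,
submit_read, write_done, run_demo) is hypothetical and made up for
illustration; none of it is Crimson or Seastar code. Completion is
signalled by an explicit write_done() call where real code would
presumably chain a future, and the ops are assumed not to call back
into the tracker synchronously.

```cpp
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: order operations per object instead of per PG.
// Single-threaded by assumption, like one Seastar shard.
class ObjectDeps {
public:
    using Op = std::function<void()>;

    // Start a write on oid, or queue it behind a write already in flight.
    // The caller reports completion later via write_done(oid).
    void submit_write(const std::string& oid, Op op) {
        State& st = objects_[oid];
        if (st.write_in_flight) {
            st.waiting.push_back(Pending{true, std::move(op)});
            return;
        }
        st.write_in_flight = true;
        op();
    }

    // Run a read immediately unless a write to the same object is in flight.
    void submit_read(const std::string& oid, Op op) {
        auto it = objects_.find(oid);
        if (it != objects_.end() && it->second.write_in_flight) {
            it->second.waiting.push_back(Pending{false, std::move(op)});
            return;
        }
        op();
    }

    // Mark the in-flight write on oid as done and release waiters in
    // arrival order, stopping as soon as the next queued write starts.
    void write_done(const std::string& oid) {
        auto it = objects_.find(oid);
        if (it == objects_.end()) return;
        State& st = it->second;
        st.write_in_flight = false;
        while (!st.waiting.empty()) {
            Pending p = std::move(st.waiting.front());
            st.waiting.pop_front();
            if (p.is_write) {
                st.write_in_flight = true;
                p.op();
                return;  // later waiters stay queued behind this write
            }
            p.op();
        }
        objects_.erase(it);  // nothing pending: forget the object
    }

private:
    struct Pending {
        bool is_write;
        Op op;
    };
    struct State {
        bool write_in_flight = false;
        std::deque<Pending> waiting;
    };
    std::map<std::string, State> objects_;
};

// Small demo: a read of A waits for the write to A, while operations on
// B are not held up by anything happening to A.
std::vector<std::string> run_demo() {
    std::vector<std::string> log;
    ObjectDeps deps;
    deps.submit_write("A", [&] { log.push_back("write A start"); });
    deps.submit_read("A", [&] { log.push_back("read A"); });  // deferred
    deps.submit_read("B", [&] { log.push_back("read B"); });  // runs now
    deps.submit_write("B", [&] { log.push_back("write B start"); });
    deps.write_done("A");  // releases the deferred read of A
    deps.write_done("B");
    return log;
}
```

The point of the sketch is only that the wait is keyed on the object
rather than on the whole PG, so unrelated objects never queue behind
each other.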

Am I misunderstanding something? Or does this make sense?
-Greg