On Tue, Feb 12, 2019 at 6:50 AM kefu chai <tchaikov@xxxxxxxxx> wrote:
>
> Chunmei,
>
> to continue the discussion in the last crimson standup, i am noting
> down some of our findings when reading Radoslaw's
> https://github.com/ceph/ceph/pull/24962.
>
> there are multiple places that we need to put the request on hold
> until the unmet precondition is satisfied. as noted by Yingxin, it
> is always more efficient to enqueue a request in the application's own
> queue than to capture it in a continuation and stash it in the task
> queue of the reactor.

Do we know how much more efficient it is? I ask because maintaining
these queues is one of the buggier systems in the existing OSD; we
can get them very stable through extensive testing, but any changes tend
to take a while to flush things out. I was very much looking forward
to just making all of those status checks a per-op future/promise
rather than having to do checks, shuffle them aside, and then do more
complicated checks about the existence of queues on the other ops. :(
-Greg

> the downside of the pending request queue is that
>
> - apparently we need to maintain a queue for each precondition that
> could hold the requests, see the `waiting_for_*` lists/maps in PG.h.
> but we also need to keep track of pending futures if we want to chain
> the maybe_wait_*() as https://pad.ceph.com/p/crimson-io-path puts it. and
> the pending futures are likely to be structured in much the same way
> as these `waiting_for_` lists, the only difference is that the values
> of these containers would be futures.
> - the op will need to go through the same checks once its precondition
> is satisfied and it is enqueued again. probably we need to check if
> there are any preconditions that imply other preconditions. if yes,
> is it plausible/worthy to /continue/ performing this request instead
> of redoing all the checks? can we reorder some of the checks for better
> performance? or for better readability?
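
To make the first point above concrete: a container of pending futures
ends up with the same shape as a `waiting_for_*` map, only with
continuations as values instead of requests. The sketch below is an
illustration in plain single-threaded C++, not Seastar or Ceph code.
MapGate, when_map_available() and advance_to() are made-up names, and
std::function is used as a stand-in for a real continuation.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

using epoch_t = std::uint32_t;

// Hypothetical stand-in for "a map of pending futures keyed by epoch".
// Same layout as a waiting_for_* map; the values are continuations.
class MapGate {
public:
  using continuation = std::function<void(epoch_t)>;

  // Run `cont` right away if epoch `need` is already available,
  // otherwise park it until advance_to() reaches that epoch.
  void when_map_available(epoch_t need, continuation cont) {
    if (need <= current_) {
      cont(current_);
    } else {
      parked_[need].push_back(std::move(cont));
    }
  }

  // Consume a newer map and resume every continuation it unblocks.
  void advance_to(epoch_t e) {
    if (e <= current_) {
      return;  // nothing new to offer
    }
    current_ = e;
    auto last = parked_.upper_bound(e);
    // Detach the unblocked entries first, so that a continuation may
    // safely park new waiters while it runs.
    std::vector<continuation> ready;
    for (auto it = parked_.begin(); it != last; ++it) {
      for (auto& c : it->second) {
        ready.push_back(std::move(c));
      }
    }
    parked_.erase(parked_.begin(), last);
    for (auto& c : ready) {
      c(e);
    }
  }

  // Number of continuations still waiting for a newer map.
  std::size_t parked() const {
    std::size_t n = 0;
    for (const auto& kv : parked_) {
      n += kv.second.size();
    }
    return n;
  }

private:
  epoch_t current_ = 0;
  std::map<epoch_t, std::vector<continuation>> parked_;
};
```

With this shape the op resumes exactly where it stopped, so it does not
repeat earlier checks; whether that is safe depends on whether those
earlier checks can be invalidated while the op is parked.
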
>
> if we need to redo all the checks like we are doing in existing
> ceph-osd, we can either
>
> - use a grand seastar::repeat() to redo a request until we run into
> some exception or the request is served. or
> - use a queue for tracking the pending requests, and rerun them in the
> fiber that fulfills the precondition. for instance, if a batch of
> requests are waiting for an updated osdmap, after consuming the
> updated osdmap, the PG will need to serve all of the requests that are
> waiting for it.
>
> what do you think?
>
> --
> Regards
> Kefu Chai
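
For reference, the second option (a queue of pending requests, rerun by
whoever fulfills the precondition) could look roughly like the sketch
below. It is plain single-threaded C++ under simplifying assumptions,
not the actual PG code: PGSketch, Request, handle(), consume_map() and
activate() are hypothetical names, and the two queues are only loosely
modeled on the `waiting_for_*` containers mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <vector>

using epoch_t = std::uint32_t;

struct Request {
  int id;
  epoch_t min_epoch;  // the request needs an osdmap at least this new
};

enum class Outcome { served, requeued };

// Hypothetical PG with two preconditions: a new enough osdmap, and the
// PG being active. Every (re)submission runs all checks from the top.
class PGSketch {
public:
  Outcome handle(const Request& req) {
    if (req.min_epoch > epoch_) {
      waiting_for_map_[req.min_epoch].push_back(req);
      return Outcome::requeued;
    }
    if (!active_) {
      waiting_for_active_.push_back(req);
      return Outcome::requeued;
    }
    served_.push_back(req.id);
    return Outcome::served;
  }

  // After consuming a newer osdmap, rerun every request it unblocks.
  void consume_map(epoch_t e) {
    if (e <= epoch_) {
      return;
    }
    epoch_ = e;
    auto last = waiting_for_map_.upper_bound(e);
    std::deque<Request> ready;
    for (auto it = waiting_for_map_.begin(); it != last; ++it) {
      ready.insert(ready.end(), it->second.begin(), it->second.end());
    }
    waiting_for_map_.erase(waiting_for_map_.begin(), last);
    for (const auto& r : ready) {
      handle(r);  // may park the request on another precondition
    }
  }

  // Once the PG becomes active, rerun everything that waited for it.
  void activate() {
    active_ = true;
    std::deque<Request> ready;
    ready.swap(waiting_for_active_);
    for (const auto& r : ready) {
      handle(r);
    }
  }

  const std::vector<int>& served() const { return served_; }
  std::size_t waiting_for_active() const { return waiting_for_active_.size(); }

private:
  epoch_t epoch_ = 0;
  bool active_ = false;
  std::map<epoch_t, std::deque<Request>> waiting_for_map_;
  std::deque<Request> waiting_for_active_;
  std::vector<int> served_;
};
```

Note that a request unblocked by consume_map() goes through the map
check a second time before it reaches the active check, which is the
"redo all the checks" cost raised in the quoted list.
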