On 13-5-2016 00:22, Samuel Just wrote:
> That sounds reasonable.

So I should submit a pull request and go through the full QA process?

--WjW

> -Sam
>
> On Thu, May 12, 2016 at 2:02 PM, Willem Jan Withagen <wjw@xxxxxxxxxxx> wrote:
>> On 11-5-2016 15:43, Gregory Farnum wrote:
>>>
>>> I've been busy lately. Looks like Sam wrote the BackoffThrottle; not
>>> sure if he did the tests as well or not. Maybe check the git logs on
>>> them. :)
>>
>>
>> Well, the test was also written by Sam.
>>
>> But everything gets stuck in the loop at BackoffThrottle::get(uint64_t
>> c):379 :
>> ====
>>   while (((start + delay) > std::chrono::system_clock::now()) ||
>>          !((max == 0) || (current == 0) || ((current + c) <= max))) {
>>     assert(ticket == waiters.begin());
>>     (*ticket)->wait_until(l, start + delay);
>>     delay = _get_delay(c);
>>   }
>> ====
>>
>> And I have the feeling that the lock is not released (long enough) when
>> start+delay is actually in the past.
>> It is of course also doubtful what is actually expected from the
>> wait_until function in this case???
>>
>> Replacing it with
>>     (*ticket)->wait_for(l, delay);
>> does let the BackoffThrottle.oversaturated test complete.
>>
>> So would this be an appropriate modification?
>>
>> --WjW
>>
>>
>>> -Greg
>>>
>>> On Wed, May 11, 2016 at 2:46 AM, Willem Jan Withagen <wjw@xxxxxxxxxxx>
>>> wrote:
>>>>
>>>> On 9-5-2016 10:44, Willem Jan Withagen wrote:
>>>>>
>>>>>
>>>>> On 6-5-2016 23:04, Willem Jan Withagen wrote:
>>>>>>
>>>>>>
>>>>>> On 6-5-2016 20:41, Gregory Farnum wrote:
>>>>
>>>>
>>>>
>>>> Hi Greg,
>>>>
>>>> If you are not the one to ask about the Throttle stuff, would you know
>>>> anybody else I could ask this question?
>>>> Because the more runs I make, the more it seems like it is a livelock.
>>>>
>>>> I'm now going to do the annotated runs on Linux as well, and try to
>>>> find out why it doesn't happen there.
>>>>
>>>> --WjW
>>>>
>>>>> OK,
>>>>>
>>>>> I'm going to need some help understanding some things...
>>>>>
>>>>> I've taken the oversaturated test_backoff from test/common/Throttle.cc,
>>>>> since that one wasn't working either. And I suspect that it might be
>>>>> for the same or a similar reason.
>>>>>
>>>>> I start one putter and one getter, and with certain parameters the test
>>>>> does not complete. When
>>>>>     expected_throughput * put_delay_per_count ~> 10 (max_multiple??)
>>>>> then the process gets into a loop, and I'm printing over and over:
>>>>>     _get_delay return r > high_threshhold (queue is full): time = 0.0172
>>>>>
>>>>> Note that the parameters to test_backoff are:
>>>>> ====
>>>>>   // double expected_throughput,
>>>>>   500,
>>>>>   // double max_multiple,
>>>>>   10,
>>>>>   // double put_delay_per_count,
>>>>>   0.025,
>>>>> ====
>>>>>
>>>>> And that loop is in BackoffThrottle::get, around line 390 (I have a lot
>>>>> of extra lines for annotation and tracing.)
>>>>> ====
>>>>>   while (((start + delay) > std::chrono::system_clock::now()) ||
>>>>>          !((max == 0) || (current == 0) || ((current + c) <= max))) {
>>>>>     assert(ticket == waiters.begin());
>>>>>     (*ticket)->wait_until(l, start + delay);
>>>>>     delay = _get_delay(c);
>>>>>   }
>>>>> ====
>>>>>
>>>>> The reason that this loop never completes is that it is executed under
>>>>> the lock in the getter thread in test/common/Throttle.cc:test_backoff,
>>>>> where it runs under g.lock(). And the putter is held on the same lock,
>>>>> so nothing gets in or out.... Not a deadlock, but a livelock.
>>>>>
>>>>> Thus far the analysis.
>>>>>
>>>>> Now the question(s)....
>>>>> 1) How do I break this livelock?
>>>>> 1a) Should the extra while() { wait } loop actually be in
>>>>> BackoffThrottle::get without releasing the lock?
>>>>>
>>>>> 2) The threads are called getter and putter.
>>>>> So I'd expect the getter to get messages from the list and the putter to
>>>>> write things to the list.
>>>>>
>>>>> Why does the getter do:
>>>>>   while(!stop) {
>>>>>     throttle.get(to_get);
>>>>>     in_queue.push_back(to_get);
>>>>>     assert(total <= max);
>>>>>   }
>>>>> And the putter:
>>>>>   while(!stop) {
>>>>>     while(empty)
>>>>>       wait;
>>>>>     c = in_queue.front();
>>>>>     in_queue.pop();
>>>>>     "wait time to queue some more"
>>>>>     throttle.put(c);
>>>>>   }
>>>>>
>>>>> Do I have it the wrong way around? Would I not expect the
>>>>> throttle.{get,put} to match the similar operations on the in_queue?
>>>>>
>>>>> And what is this c value that we retrieve from the in_queue and then
>>>>> put into the throttle???
>>>>>
>>>>> --WjW
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
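
For reference, the wait_until versus wait_for difference discussed in this thread can be sketched in isolation. This is only a minimal illustration, assuming a plain std::mutex and std::condition_variable rather than Ceph's own lock and ticket types; the helper names past_deadline_times_out and other_thread_runs_during_wait_for are made up for this sketch and do not exist in Ceph. It does not try to reproduce the FreeBSD behaviour reported above. It only shows that a deadline in the past makes wait_until report a timeout, so a loop like the one in BackoffThrottle::get would keep re-evaluating its condition, while a relative wait_for blocks and gives a second thread a window to take the lock.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

using Clock = std::chrono::system_clock;

// Hypothetical helper: wait on a deadline that has already passed.
// Returns true when wait_until reports a timeout for that deadline.
bool past_deadline_times_out() {
  std::mutex m;
  std::condition_variable cv;
  std::unique_lock<std::mutex> l(m);
  auto deadline = Clock::now() - std::chrono::seconds(1);  // "start + delay" in the past
  return cv.wait_until(l, deadline) == std::cv_status::timeout;
}

// Hypothetical helper: while the waiter is blocked in wait_for, check
// whether a second thread (standing in for the putter) gets the mutex.
// Returns true if the second thread ran before the 5 second timeout.
bool other_thread_runs_during_wait_for() {
  std::mutex m;
  std::condition_variable cv;
  bool released = false;  // guarded by m

  std::unique_lock<std::mutex> l(m);  // the "getter" holds the lock
  std::thread putter([&] {
    std::lock_guard<std::mutex> g(m);  // blocks until the waiter releases m
    released = true;
    cv.notify_one();
  });

  // Relative timeout with a predicate: m is unlocked while this thread is
  // blocked, so the putter can acquire it, set the flag, and notify.
  bool ok = cv.wait_for(l, std::chrono::seconds(5), [&] { return released; });
  l.unlock();
  putter.join();
  return ok;
}
```

Calling both helpers from a small main() and printing their results is enough to try this on Linux and FreeBSD side by side. Whether the brief unlock inside a past-deadline wait_until is long enough for the other thread to get the lock depends on the platform and scheduler, which is exactly the open question in the thread.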