Re: Could be due to a race condition... Re: Help with understanding throttle.finish()

On 13-5-2016 00:22, Samuel Just wrote:
> That sounds reasonable.

And thus submit a pull request and go through the full QA process?
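
For reference on the wait_until question in the quoted mail below, here is a
minimal standalone sketch, not Ceph code. The names WaitResult,
wait_until_past_deadline and wait_for_relative are made up for this
illustration. It assumes only the standard std::condition_variable behaviour:
wait_until() with a deadline that has already passed should report a timeout
without sleeping, while wait_for() with a relative delay actually blocks and
releases the mutex while it does.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper type: what a single wait call reported and how long it took.
struct WaitResult {
  std::cv_status status;              // timeout or no_timeout
  std::chrono::milliseconds elapsed;  // time spent inside the wait call
};

// wait_until() with a deadline one second in the past, mirroring the case
// where "start + delay" has already expired in the quoted loop.
WaitResult wait_until_past_deadline() {
  std::mutex m;
  std::condition_variable cv;
  std::unique_lock<std::mutex> l(m);
  const auto deadline =
      std::chrono::system_clock::now() - std::chrono::seconds(1);
  const auto t0 = std::chrono::steady_clock::now();
  const std::cv_status status = cv.wait_until(l, deadline);
  const auto t1 = std::chrono::steady_clock::now();
  assert(l.owns_lock());  // the lock is held again when the call returns
  return {status,
          std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0)};
}

// wait_for() with a relative delay. Nobody notifies, so apart from a
// spurious wakeup the caller sleeps, with the mutex released, for about
// `delay` before the call returns.
WaitResult wait_for_relative(std::chrono::milliseconds delay) {
  std::mutex m;
  std::condition_variable cv;
  std::unique_lock<std::mutex> l(m);
  const auto t0 = std::chrono::steady_clock::now();
  const std::cv_status status = cv.wait_for(l, delay);
  const auto t1 = std::chrono::steady_clock::now();
  assert(l.owns_lock());
  return {status,
          std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0)};
}
```

If the throttle is still full once start + delay has passed, a loop built on
the first call would spin without sleeping, while a loop built on the second
sleeps for delay on every pass. That would fit what I see with the
oversaturated test. Whether and for how long the mutex is released in the
past-deadline case is up to the library implementation, so that part remains
a guess.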

--WjW

> -Sam
> 
> On Thu, May 12, 2016 at 2:02 PM, Willem Jan Withagen <wjw@xxxxxxxxxxx> wrote:
>> On 11-5-2016 15:43, Gregory Farnum wrote:
>>>
>>> I've been busy lately. Looks like Sam wrote the BackoffThrottle; not
>>> sure if he did the tests as well or not. Maybe check the git logs on
>>> them. :)
>>
>>
>> Well, the test is also written by Sam.
>>
>> But everything gets stuck in the loop at BackoffThrottle::get(uint64_t
>> c):379 :
>> ====
>>   while (((start + delay) > std::chrono::system_clock::now()) ||
>>          !((max == 0) || (current == 0) || ((current + c) <= max))) {
>>     assert(ticket == waiters.begin());
>>     (*ticket)->wait_until(l, start + delay);
>>     delay = _get_delay(c);
>>   }
>> ====
>>
>> And I have the feeling that the lock is not released (long enough) when
>> start+delay is actually in the past.
>> It is of course also unclear what wait_until is actually expected to do
>> in this case.
>>
>> Replacing it with
>>         (*ticket)->wait_for(l, delay);
>> does get the BackoffThrottle.oversaturated test to complete.
>>
>> So would this be an appropriate modification?
>>
>> --WjW
>>
>>
>>> -Greg
>>>
>>> On Wed, May 11, 2016 at 2:46 AM, Willem Jan Withagen <wjw@xxxxxxxxxxx>
>>> wrote:
>>>>
>>>> On 9-5-2016 10:44, Willem Jan Withagen wrote:
>>>>>
>>>>>
>>>>> On 6-5-2016 23:04, Willem Jan Withagen wrote:
>>>>>>
>>>>>>
>>>>>> On 6-5-2016 20:41, Gregory Farnum wrote:
>>>>
>>>>
>>>>
>>>> Hi Greg,
>>>>
>>>> If you are not the one to ask about the Throttle stuff, would you know
>>>> anybody else I could ask this question?
>>>> Because the more runs I make, the more it looks like a livelock.
>>>>
>>>> I'm now going to do the annotated runs on Linux as well, and try and
>>>> find out why it doesn't happen there.
>>>>
>>>> --WjW
>>>>
>>>>> Okay,
>>>>>
>>>>> I'm going to need some help understanding some things...
>>>>>
>>>>> I've taken the oversaturated test_backoff from test/common/Throttle.cc,
>>>>> since that one wasn't working either. And I suspect that it might be
>>>>> for the same or a similar reason.
>>>>>
>>>>> I start one putter and one getter, and with certain parameters the test
>>>>> does not complete. When
>>>>>     expected_throughput * put_delay_per_count ~> 10 (max_multiple??)
>>>>> the process gets into a loop, printing over and over:
>>>>> _get_delay return r > high_threshhold (queue is full): time = 0.0172
>>>>>
>>>>> Note that the parameters to test_backoff are:
>>>>> ====
>>>>> //  double expected_throughput,
>>>>>     500,
>>>>> //  double max_multiple,
>>>>>     10,
>>>>> //  double put_delay_per_count,
>>>>>     0.025,
>>>>> ====
>>>>>
>>>>> And that loop is in BackoffThrottle::get, around line 390 (I have a lot
>>>>> of extra lines for annotation and tracing.)
>>>>> ====
>>>>>   while (((start + delay) > std::chrono::system_clock::now()) ||
>>>>>          !((max == 0) || (current == 0) || ((current + c) <= max))) {
>>>>>     assert(ticket == waiters.begin());
>>>>>     (*ticket)->wait_until(l, start + delay);
>>>>>     delay = _get_delay(c);
>>>>>   }
>>>>> ====
>>>>>
>>>>> The reason that this loop never completes is that it is executed in the
>>>>> getter thread in test/common/Throttle.cc:test_backoff, where it runs
>>>>> under g.lock(). And the putter is held on the same lock, so nothing
>>>>> gets in or out.... Not a deadlock, but a livelock.
>>>>>
>>>>> Thus far the analysis.
>>>>>
>>>>> Now the question(s)....
>>>>> 1) How do I break this livelock?
>>>>> 1a) Should the extra while() { wait } loop actually be in
>>>>> BackoffThrottle::get without releasing the lock?
>>>>>
>>>>> 2) The threads are called getter and putter.
>>>>> So I'd expect the getter to get messages from the list and the putter to
>>>>> write things to the list.
>>>>>
>>>>> Why does the getter do:
>>>>>         while(!stop) {
>>>>>             throttle.get(to_get);
>>>>>             in_queue.push_back(to_get);
>>>>>             assert(total <= max);
>>>>>         }
>>>>> and the putter:
>>>>>         while(!stop) {
>>>>>             while(empty)
>>>>>                 wait;
>>>>>             c = in_queue.front();
>>>>>             in_queue.pop();
>>>>>             "wait time to queue some more"
>>>>>             throttle.put(c)
>>>>>         }
>>>>>
>>>>> Do I have this the wrong way around? Would I not expect
>>>>> throttle.{get,put} to match the similar operations on the in_queue?
>>>>>
>>>>> And what is this c value that we retrieve from the in_queue and then
>>>>> put into the throttle?
>>>>>
>>>>> --WjW
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>
>>>
>>
