Re: Swift tests failing randomly

On 11/08/2014 19:34, Yuri Weinstein wrote:
> Here is what we have in vps.yaml now:
> 
> overrides:
>   ceph:
>     conf:
>       global:
>         osd heartbeat grace: 40
> 
> What do we want to add?

I think we want to add the idle_timeout values from

https://github.com/ceph/ceph-qa-suite/pull/79/files
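
To make the idea concrete, here is a minimal sketch of how an extra override could be layered on top of the existing vps.yaml content. It is plain Python with a generic recursive merge, not the actual teuthology code. The `rgw` / `idle_timeout` key placement and the value 600 are assumptions for illustration only; the real keys and values are the ones in the pull request.

```python
import copy


def deep_merge(base, override):
    """Return a new dict with `override` recursively merged over `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, or new keys: the override wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Current content of vps.yaml, as quoted in this thread.
vps_overrides = {
    "overrides": {
        "ceph": {"conf": {"global": {"osd heartbeat grace": 40}}},
    }
}

# Hypothetical addition: key placement and value are assumptions,
# not copied from the pull request.
rgw_timeout = {
    "overrides": {
        "rgw": {"idle_timeout": 600},
    }
}

merged = deep_merge(vps_overrides, rgw_timeout)
print(merged)
```

The existing heartbeat setting survives the merge and the new timeout sits beside it, which is the behaviour we would want from a per-machine-type file.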


> 
> ~
> 
> On Mon, Aug 11, 2014 at 10:13 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>> On Mon, 11 Aug 2014, Yehuda Sadeh wrote:
>>> Yeah, looking at these logs, it really seems that things are just
>>> going slow on these machines and hitting timeouts. The fix is
>>> ok with me, although I'd rather have it adjusted per machine type
>>> (somehow).
>>
>> There is a vps.yaml that bumps up another timeout, so we could put it
>> there.  Right now it lives on the teuthology machine
>> (~teuthworker/vps.yaml I think?), but perhaps we should stick it in
>> ceph-qa-suite.git somewhere ...
>>
>> sage
>>
>>>
>>> Yehuda
>>>
>>> On Mon, Aug 11, 2014 at 9:21 AM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>>>> Hi Yehuda,
>>>>
>>>> It looks like increasing the rgw idle timeout makes the problem go away (https://github.com/ceph/ceph-qa-suite/pull/79 and http://tracker.ceph.com/issues/8988). It was previously 300 seconds, which already looks like a large value. Does this fix/workaround make sense to you?
>>>>
>>>> Cheers
>>>>
>>>> On 10/08/2014 10:46, Loic Dachary wrote:
>>>>> Hi Yehuda,
>>>>>
>>>>> Over the past few months the swift tests have failed randomly, and unfortunately I have been unable to figure out why. Here are a few examples:
>>>>>
>>>>>     http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406944
>>>>>     http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406941
>>>>>     http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406946
>>>>>     http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406947
>>>>>
>>>>> and it has happened on every upgrade test run for as long as I can remember. I fail to see a pattern and cannot figure out what the real problem is. It would be really great if you could take a look. Even a hunch or a tip would be greatly appreciated :-)
>>>>>
>>>>> You can find more context in
>>>>>
>>>>> http://tracker.ceph.com/issues/8988
>>>>> http://tracker.ceph.com/issues/8016
>>>>> http://tracker.ceph.com/issues/7799
>>>>>
>>>>> and discussions at
>>>>>
>>>>> http://www.spinics.net/lists/ceph-devel/msg19933.html
>>>>>
>>>>> Cheers
>>>>>
>>>>
>>>> --
>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>

-- 
Loïc Dachary, Artisan Logiciel Libre
