RE: teuthology timeout error

Hi Loic,

> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests
> so I can try to reproduce / diagnose the problem?

https://github.com/kawaguchi-s/teuthology/tree/wip-10886
https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886

Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.

Best regards,
Takeshi Miyamae

-----Original Message-----
From: Loic Dachary [mailto:loic@xxxxxxxxxxx] 
Sent: Wednesday, May 20, 2015 4:49 PM
To: Miyamae, Takeshi/宮前 剛; Ceph Development
Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
Subject: Re: teuthology timeout error

Hi,

On 20/05/2015 04:20, Miyamae, Takeshi wrote:
> Hi Loic,
> 
> After we fixed our own issue and restarted teuthology,

Great!

> we encountered another issue (a timeout error), which occurs with an LRC pool as well.
> Do you have any information about it?

Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem?

Thanks

> 
> [error messages (in the case of an LRC pool)]
> 
> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired
> 
> Traceback (most recent call last):
>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>     result = self._run(*self.args, **self.kwargs)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired <Greenlet at 0x2a7d550: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with AssertionError
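> 
> For context, the assertion above is raised by the recovery wait in ceph_manager.py: the thrasher gives up once the timeout configured for the thrashosds task (1200 seconds below) expires without the cluster returning to a fully clean state. The "no progress seen, keeping timeout for now" line suggests the wait is only extended while recovery is actually making progress. Below is a minimal, self-contained Python sketch of that kind of check; it is not the actual ceph_manager.py code, and the JSON field names read from 'ceph status --format=json' are assumptions:
> 
> import json
> import subprocess
> import time
> 
> def all_pgs_active_clean(status):
>     """Return True when every PG reported by 'ceph status' is active+clean."""
>     pgmap = status["pgmap"]  # assumed layout of the JSON status output
>     clean = sum(s["count"] for s in pgmap.get("pgs_by_state", [])
>                 if s["state_name"] == "active+clean")
>     return clean == pgmap["num_pgs"]
> 
> def wait_for_recovery(timeout=1200, interval=10):
>     """Poll cluster status until recovery completes or the timeout expires."""
>     deadline = time.time() + timeout
>     while time.time() < deadline:
>         out = subprocess.check_output(["ceph", "status", "--format=json"])
>         if all_pgs_active_clean(json.loads(out)):
>             return
>         time.sleep(interval)
>     assert False, "failed to recover before timeout expired"
> 
> In the failing run above, that window elapsed before all PGs went back to active+clean.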
> 
> [ceph version]
> 0.93-952-gfe28daa
> 
> [teuthology, ceph-qa-suite]
> latest versions as of 3/25/2015
> 
> [configurations]
>   check-locks: false
>   overrides:
>     ceph:
>       conf:
>         global:
>           ms inject socket failures: 5000
>         osd:
>           osd heartbeat use min delay socket: true
>           osd sloppy crc: true
>       fs: xfs
>   roles:
>   - - mon.a
>     - osd.0
>     - osd.4
>     - osd.8
>     - osd.12
>   - - mon.b
>     - osd.1
>     - osd.5
>     - osd.9
>     - osd.13
>   - - mon.c
>     - osd.2
>     - osd.6
>     - osd.10
>     - osd.14
>   - - osd.3
>     - osd.7
>     - osd.11
>     - osd.15
>     - client.0
>   targets:
>     ubuntu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
>     ubuntu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
>     ubuntu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
>     ubuntu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
>   tasks:
>   - ceph:
>       conf:
>         osd:
>           osd debug reject backfill probability: 0.3
>           osd max backfills: 1
>           osd scrub max interval: 120
>           osd scrub min interval: 60
>       log-whitelist:
>       - wrongly marked me down
>       - objects unfound and apparently lost
>   - thrashosds:
>       chance_pgnum_grow: 1
>       chance_pgpnum_fix: 1
>       min_in: 4
>       timeout: 1200
>   - rados:
>       clients:
>       - client.0
>       ec_pool: true
>       erasure_code_profile:
>         k: 4
>         l: 3
>         m: 2
>         name: lrcprofile
>         plugin: lrc
>         ruleset-failure-domain: osd
>       objects: 50
>       op_weights:
>         append: 100
>         copy_from: 50
>         delete: 50
>         read: 100
>         rmattr: 25
>         rollback: 50
>         setattr: 25
>         snap_create: 50
>         snap_remove: 50
>         write: 0
>       ops: 190000
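> 
> For reference, as I understand the lrc plugin, a profile with k=4, m=2 and l=3 groups the data and coding chunks into sets of size l and adds one local parity chunk per set, i.e. (k + m) / l = 6 / 3 = 2 extra chunks. Each placement group therefore stores k + m + (k + m) / l = 4 + 2 + 2 = 8 chunks on 8 distinct OSDs (ruleset-failure-domain=osd) out of the 16 OSDs listed above; this arithmetic is only meant to make the chunk layout of the profile explicit.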
> 
> Best regards,
> Takeshi Miyamae
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
