Hi Takeshi,

I'm trying to reproduce your problem at https://github.com/ceph/ceph-qa-suite/pull/445. To be continued :-)

Cheers

On 26/05/2015 04:39, Miyamae, Takeshi wrote:
> Hi Loic,
>
> We rebased our teuthology/ceph-qa-suite and retried the LRC test on current master.
> However, we unfortunately got the same result as before (timeout error).
>
> [test conditions]
> Target : Ceph-9.0.0-971-gd49d816
> https://github.com/kawaguchi-s/teuthology
> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886-lrc
>
> [teuthology log]
>
> 2015-05-25 10:18:23 # start
>
> 2015-05-25 11:59:52,106.106 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
> 2015-05-25 11:59:52,564.564 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
> 2015-05-25 11:59:52,565.565 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired
>
> Traceback (most recent call last):
>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>     result = self._run(*self.args, **self.kwargs)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired
> <Greenlet at 0x36cacd0: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x36df3f8>>> failed with AssertionError
>
> Best regards,
> Takeshi Miyamae
>
> -----Original Message-----
> From: Loic Dachary [mailto:loic@xxxxxxxxxxx]
> Sent: Thursday, May 21, 2015 6:38 PM
> To: Miyamae, Takeshi/宮前 剛; Ceph Development
> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
> Subject: Re: teuthology timeout error
>
> Hi,
>
> [sorry the previous mail was sent by accident, here is the full mail]
>
> On 21/05/2015 10:32, Miyamae, Takeshi wrote:
>> Hi Loic,
>>
>>> Could you please share the teuthology/ceph-qa-suite repository you
>>> are using to run these tests so I can try to reproduce / diagnose the problem ?
>>
>> https://github.com/kawaguchi-s/teuthology/tree/wip-10886
>> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886
>>
>
> When compared against master they show differences that indicate it would be good to rebase:
>
> https://github.com/ceph/teuthology/compare/master...kawaguchi-s:wip-10886
> https://github.com/ceph/ceph-qa-suite/compare/master...kawaguchi-s:wip-10886
>
> I think the teuthology commit on top of wip-10886 is a mistake
>
> https://github.com/kawaguchi-s/teuthology/commit/348e54931f89c9b0ae7a84eb931576f8414017b5
>
> do you really need to modify teuthology ? It should just be necessary to use the latest master branch.
>
> It looks like the
>
> https://github.com/kawaguchi-s/ceph-qa-suite/commit/f2e3ca5d12ceef742eae2a9cf4057c436e9040c3
>
> commit in your ceph-qa-suite is not what you intended. However
>
> https://github.com/kawaguchi-s/ceph-qa-suite/commit/4b39d6d4862f9091a849d224e880795be406815d
> https://github.com/kawaguchi-s/ceph-qa-suite/commit/d16b4b058ae118931928541a2c8acd68f9703a44
>
> look ok :-) Instead of naming the test 4nodes16osds3mons1client.yaml it would be better to use the same kind of naming you see at https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados/thrash-erasure-code/workloads. That is a file name made of the distinctive parameters for the shec plugin (the parameters that are the default can be omitted).
>
> Cheers
>
>> Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.
>>
>> Best regards,
>> Takeshi Miyamae
>>
>> -----Original Message-----
>> From: Loic Dachary [mailto:loic@xxxxxxxxxxx]
>> Sent: Wednesday, May 20, 2015 4:49 PM
>> To: Miyamae, Takeshi/宮前 剛; Ceph Development
>> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
>> Subject: Re: teuthology timeout error
>>
>> Hi,
>>
>> On 20/05/2015 04:20, Miyamae, Takeshi wrote:
>>> Hi Loic,
>>>
>>> When we fixed our own issue and restarted teuthology,
>>
>> Great !
>>
>>> we encountered another issue (timeout error) which occurs in case of LRC as well.
>>> Do you have any information about that ?
>>
>> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem ?
>>
>> Thanks
>>
>>>
>>> [error messages (in case of LRC pool)]
>>>
>>> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
>>> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
>>> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>>     return func(self)
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>>     timeout=self.config.get('timeout')
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>>     'failed to recover before timeout expired'
>>> AssertionError: failed to recover before timeout expired
>>>
>>> Traceback (most recent call last):
>>>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>>>     result = self._run(*self.args, **self.kwargs)
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>>     return func(self)
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>>     timeout=self.config.get('timeout')
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>>     'failed to recover before timeout expired'
>>> AssertionError: failed to recover before timeout expired
>>> <Greenlet at 0x2a7d550: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with AssertionError
>>>
>>> [ceph version]
>>> 0.93-952-gfe28daa
>>>
>>> [teuthology, ceph-qa-suite]
>>> newest version at 3/25/2015
>>>
>>> [configurations]
>>> check-locks: false
>>> overrides:
>>>   ceph:
>>>     conf:
>>>       global:
>>>         ms inject socket failures: 5000
>>>       osd:
>>>         osd heartbeat use min delay socket: true
>>>         osd sloppy crc: true
>>>     fs: xfs
>>> roles:
>>> - - mon.a
>>>   - osd.0
>>>   - osd.4
>>>   - osd.8
>>>   - osd.12
>>> - - mon.b
>>>   - osd.1
>>>   - osd.5
>>>   - osd.9
>>>   - osd.13
>>> - - mon.c
>>>   - osd.2
>>>   - osd.6
>>>   - osd.10
>>>   - osd.14
>>> - - osd.3
>>>   - osd.7
>>>   - osd.11
>>>   - osd.15
>>>   - client.0
>>> targets:
>>>   ubuntu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
>>>   ubuntu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
>>>   ubuntu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
>>>   ubuntu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
>>> tasks:
>>> - ceph:
>>>     conf:
>>>       osd:
>>>         osd debug reject backfill probability: 0.3
>>>         osd max backfills: 1
>>>         osd scrub max interval: 120
>>>         osd scrub min interval: 60
>>>     log-whitelist:
>>>     - wrongly marked me down
>>>     - objects unfound and apparently lost
>>> - thrashosds:
>>>     chance_pgnum_grow: 1
>>>     chance_pgpnum_fix: 1
>>>     min_in: 4
>>>     timeout: 1200
>>> - rados:
>>>     clients:
>>>     - client.0
>>>     ec_pool: true
>>>     erasure_code_profile:
>>>       k: 4
>>>       l: 3
>>>       m: 2
>>>       name: lrcprofile
>>>       plugin: lrc
>>>       ruleset-failure-domain: osd
>>>     objects: 50
>>>     op_weights:
>>>       append: 100
>>>       copy_from: 50
>>>       delete: 50
>>>       read: 100
>>>       rmattr: 25
>>>       rollback: 50
>>>       setattr: 25
>>>       snap_create: 50
>>>       snap_remove: 50
>>>       write: 0
>>>     ops: 190000
>>>
>>> Best regards,
>>> Takeshi Miyamae
>>>
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
>

--
Loïc Dachary, Artisan Logiciel Libre
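
The naming suggestion in the thread (a workload file name built from the distinctive erasure code profile parameters, leaving out the ones that are defaults) could be sketched roughly as below. This is only an illustration: the helper workload_file_name, the "ec-rados-plugin=" prefix and the defaults argument are assumptions made for the example, not something taken from ceph-qa-suite or teuthology.

```python
def workload_file_name(plugin, params, defaults=None):
    """Build a workload file name from distinctive profile parameters.

    Hypothetical helper, not part of ceph-qa-suite. `params` is an ordered
    list of (key, value) pairs; `defaults` maps keys to values that are
    considered defaults and are therefore left out of the name.
    """
    defaults = defaults or {}
    # Assumed prefix; check the workloads directory for the real pattern.
    parts = ["ec-rados-plugin={}".format(plugin)]
    for key, value in params:
        if defaults.get(key) == value:
            continue  # default value: not distinctive, omit it
        parts.append("{}={}".format(key, value))
    return "-".join(parts) + ".yaml"


if __name__ == "__main__":
    # Parameters taken from the LRC profile quoted in the thread.
    print(workload_file_name("lrc", [("k", 4), ("m", 2), ("l", 3)]))
```

Passing a defaults mapping (for instance a made-up {"c": 2} for shec) drops the matching parameter from the name, which is the "defaults can be omitted" part of the suggestion.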