On 28/02/2015 16:47, Yuri Weinstein wrote: > Loic > > In case you want to add some comments - http://tracker.ceph.com/issues/10945 Done thanks ! > > Thx > YuriW > > ----- Original Message ----- > From: "Loic Dachary" <loic@xxxxxxxxxxx> > To: "Ceph Development" <ceph-devel@xxxxxxxxxxxxxxx> > Sent: Saturday, February 28, 2015 7:01:29 AM > Subject: Re: re-running teuthology jobs > > The simpler way is to use the --filter argument of teuthology-suite with the value of the description: field found in the config.yaml file. For instance, running the rados failed jobs http://tracker.ceph.com/issues/10641#rados failed jobs: > > $ ./virtualenv/bin/teuthology-suite --priority 101 --suite rados --filter 'rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml},rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}' --suite-branch firefly --machine-type plana,burnupi,mira --distro ubuntu --email loic@xxxxxxxxxxx --owner loic@xxxxxxxxxxx --ceph firefly-backports > 2015-02-28 15:58:08,474.474 INFO:teuthology.suite:ceph sha1: e54834bfac3c38562987730b317cb1944a96005b > 2015-02-28 15:58:08,969.969 INFO:teuthology.suite:ceph version: 0.80.8-75-ge54834b-1precise > 2015-02-28 15:58:09,606.606 INFO:teuthology.suite:teuthology branch: master > 2015-02-28 15:58:10,407.407 INFO:teuthology.suite:ceph-qa-suite branch: firefly > 2015-02-28 15:58:10,409.409 INFO:teuthology.repo_utils:Fetching from upstream into /home/loic/src/ceph-qa-suite_firefly > 2015-02-28 15:58:11,522.522 INFO:teuthology.repo_utils:Resetting repo at /home/loic/src/ceph-qa-suite_firefly to branch firefly > 2015-02-28 15:58:12,393.393 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados generated 693 jobs (not yet filtered) > 2015-02-28 15:58:12,419.419 INFO:teuthology.suite:Scheduling rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml} > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783145 > 2015-02-28 15:58:14,199.199 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml} > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783146 > 2015-02-28 15:58:15,650.650 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml} > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783147 > 2015-02-28 15:58:16,837.837 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml} > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783148 > 2015-02-28 15:58:18,421.421 INFO:teuthology.suite:Scheduling rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml} > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783149 > 2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados scheduled 5 jobs. > 2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados -- 688 jobs were filtered out. > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783150 > > Creates the http://pulpito.ceph.com/loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi/ run with just 5 jobs. > > On 28/02/2015 11:28, Loic Dachary wrote: >> Hi, >> >> A teuthology rados run ( https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados ) completed with five dead jobs out of 693. They failed because of DNS errors and I'd like to re-run them. Ideally I could do something like: >> >> teuthology-schedule --run loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi --job-id 781444 --job-id 781457 ... >> >> and it would re-schedule a run of the designated jobs from the designated run. But I don't think such a command exist. >> >> I will therefore manually do what such a command would do, for each failed job: >> >> * download http://qa-proxy.ceph.com/teuthology/loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi/781444/orig.config.yaml >> * git clone https://github.com/ceph/ceph-qa-suite /srv/ceph-qa-suite >> * cd /srv/ceph-qa-suite ; git checkout firefly (assuming that's the ceph-qa-suite branch I'm interested in) >> * remove the fields: >> job_id: '781444' >> last_in_suite: false >> worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.14588 >> * replace the suite_path: field with suite_path: /srv/ceph-qa-suite >> * teuthology-lock --lock enough machines (i.e. one for each element in the roles: section of the orig.config.yaml) >> * turn the machine list into a consumable file for teuthology : teuthology-lock --list-targets > targets.yaml >> * run teuthology orig.config.yaml targets.yaml >> * wait for the result >> >> Is there a better way to do that ? >> >> Cheers >> > -- Loïc Dachary, Artisan Logiciel Libre
Attachment:
signature.asc
Description: OpenPGP digital signature