re-running teuthology jobs

Hi,

A teuthology rados run ( https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados ) completed with five dead jobs out of 693. They failed because of DNS errors and I'd like to re-run them. Ideally I could do something like:

teuthology-schedule --run loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi --job-id 781444 --job-id  781457 ...

and it would re-schedule the designated jobs from the designated run. But I don't think such a command exists.

I will therefore manually do what such a command would do, for each failed job:

* download http://qa-proxy.ceph.com/teuthology/loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi/781444/orig.config.yaml
* git clone https://github.com/ceph/ceph-qa-suite /srv/ceph-qa-suite
* cd /srv/ceph-qa-suite ; git checkout firefly (assuming that's the ceph-qa-suite branch I'm interested in)
* remove the following fields from the downloaded orig.config.yaml:
   job_id: '781444'
   last_in_suite: false
   worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.14588
* replace the suite_path: field with suite_path: /srv/ceph-qa-suite
* teuthology-lock --lock enough machines (i.e. one for each element in the roles: section of the orig.config.yaml)
* turn the list of locked machines into a file teuthology can consume: teuthology-lock --list-targets > targets.yaml
* run teuthology orig.config.yaml targets.yaml
* wait for the result
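
The download and config-editing steps above could be scripted along these lines. This is only a sketch under assumptions: the function names orig_config_url and prepare_config are hypothetical, not part of teuthology, and the code edits the file line by line rather than parsing YAML, so it assumes job_id, last_in_suite, worker_log and suite_path each appear as a top-level, single-line entry as in the example above. If suite_path is absent from the file, it is not added.

```python
import re

# Fields the steps above say to remove before re-running a job.
FIELDS_TO_DROP = ("job_id", "last_in_suite", "worker_log")

# Base URL taken from the download step above.
ARCHIVE_URL = "http://qa-proxy.ceph.com/teuthology"


def orig_config_url(run_name, job_id):
    """Build the orig.config.yaml URL for one job of a run."""
    return "{}/{}/{}/orig.config.yaml".format(ARCHIVE_URL, run_name, job_id)


def prepare_config(text, suite_path="/srv/ceph-qa-suite"):
    """Drop the per-job fields and point suite_path at a local checkout.

    Assumption: every affected key is a top-level, single-line scalar.
    Indented (nested) keys are left untouched.
    """
    out = []
    for line in text.splitlines():
        match = re.match(r"^([A-Za-z_][\w-]*):", line)
        key = match.group(1) if match else None
        if key in FIELDS_TO_DROP:
            continue  # skip the field entirely
        if key == "suite_path":
            line = "suite_path: " + suite_path
        out.append(line)
    return "\n".join(out) + "\n"
```

The text returned by prepare_config would then be written back to orig.config.yaml before locking machines and running teuthology orig.config.yaml targets.yaml as described above.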

Is there a better way to do this?

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre
