Hi Loic,

Based on your feedback, a few action items emerged for improving this
containerized approach to running teuthology jobs:

1. use install-deps.sh for installing dependencies
2. modify the sshd configuration so that the ssh port is specified at
   runtime via an environment variable. This makes it possible to use
   --net=host, and thus more than one remote can run locally (for jobs
   with multiple remotes).
3. add an option to provide a sha1 so that the code gets checked out
   and built as part of the entrypoint of the container.
4. write a 'dockerize-config' script that takes a failed job's YAML
   file and modifies it so that it can run with containers.
5. write a 'failed-devenv' script that, given a URL to a failed job,
   (a) fetches the YAML file, (b) runs the dockerize-config script,
   (c) checks out the corresponding sha1, and (d) compiles the code.
6. write a 'run-failed-job' script that (a) re-builds the code,
   (b) instantiates one container for each specified remote, and
   (c) executes the job.

I've implemented 1-3 and am working on 4-6. In short, the goal of all
the above is to capture the dev/build/test loop and make it easier to
go from 'failed job' to 'working on a fix'. The high-level sequence is
(1) run 'make-failed-devenv' to get the dev environment for the failed
job, (2) work on a fix, and (3) invoke 'run-failed-job' and inspect the
results (possibly going back to (2) if needed).

Thoughts on 4-6? A rough sketch of the 'dockerize-config' idea is
included at the bottom of this message.

cheers,
ivo

On Thu, Sep 3, 2015 at 3:23 PM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>
>
> On 03/09/2015 23:45, Ivo Jimenez wrote:
>> On Thu, Sep 3, 2015 at 3:09 AM Loic Dachary <loic@xxxxxxxxxxx> wrote:
>>>
>>>> 2. Initialize a `cephdev` container (the following assumes `$PWD` is
>>>>    the folder containing the ceph code on your machine):
>>>>
>>>> ```bash
>>>> docker run \
>>>>   --name remote0 \
>>>>   -p 2222:22 \
>>>>   -d -e AUTHORIZED_KEYS="`cat ~/.ssh/id_rsa.pub`" \
>>>>   -v `pwd`:/ceph \
>>>>   -v /dev:/dev \
>>>>   -v /tmp/ceph_data/$RANDOM:/var/lib/ceph \
>>>>   --cap-add=SYS_ADMIN --privileged \
>>>>   --device /dev/fuse \
>>>>   ivotron/cephdev
>>>> ```
>>>
>>> $PWD is ceph built from sources ? Could you share the dockerfile you
>>> used to create ivotron/cephdev ?
>>
>> Yes, the idea is to wrap your ceph folder in a container so that it
>> becomes a target for teuthology. The link to the dockerfile:
>>
>> https://github.com/ivotron/docker-cephdev
>
> You may want to use install-deps.sh instead of apt-get build-dep to get
> the packages from sources instead of a presumably older one from the
> source repositories.
>
>>>
>>>> Caveats:
>>>>
>>>> * only a single job can be executed and it has to be manually
>>>>   assembled. I plan to work on supporting suites, which, in short,
>>>>   implies stripping out the `install` task from existing suites and
>>>>   leaving only the `install.ship_utilities` subtask instead (the
>>>>   container image has all the dependencies in it already).
>>>
>>> Maybe there could be a script to transform config files such as
>>> http://qa-proxy.ceph.com/teuthology/loic-2015-09-02_15:41:18-rbd-master---basic-multi/1042448/config.yaml
>>> into a config file suitable for this use case ?
>>
>> that's what I have in mind but haven't looked into it yet. I was
>> thinking about extending teuthology-suite so that you pass a
>> --filter-tasks flag so that we can remove the unwanted tasks, in a
>> similar way to how --filter leaves some suites out.
>>
>>> Together with git clone -b $sha1 + make in the container, it would be
>>> a nice way to replay / debug a failed job using a single vm and
>>> without going through packages.
>>
>> that'd be relatively straightforward to accomplish, at least the
>> docker side of things (a dockerfile that is given the $SHA1). Prior to
>> that, we'd need to have a script that extracts the failed job from
>> paddles (does this exist already?), creates a new sha1-predicated
>
> What do you mean by "extract the failed job" ? Do you expect paddles to
> have more information than the config.yaml file
> (loic-2015-09-02_15:41:18-rbd-master---basic-multi/1042448/config.yaml
> for instance) ?
>
>> container and passes the yaml file of the failed job to teuthology
>> (which would be invoked with the hypothetical --filter-tasks flag
>> mentioned above).
>
> It's probably more than just filtering out tasks. What about a script
> that would
>
>   dockerize-config < config.yaml > docker-config.yaml
>
> and be smart enough to do whatever is necessary to transform an
> existing config.yaml so that it is suitable to run on docker targets.
> And fail loudly if it can't ;-)
>
>>>
>>>> * I have only tried the above with the `radosbench` and `ceph-fuse`
>>>>   tasks. Using the `--cap-add=ALL` and `-v /lib/modules:/lib/modules`
>>>>   flags allows a container to load kernel modules so, in principle,
>>>>   it should work for the `rbd` and `kclient` tasks, but I haven't
>>>>   tried it yet.
>>>> * For jobs specifying multiple remotes, multiple containers can be
>>>>   launched (one per remote). While it is possible to run these on
>>>>   the same docker host, the way ceph daemons dynamically bind to
>>>>   ports in the 6800-7300 range makes it difficult to determine which
>>>>   ports to expose from each container (exposing the same port from
>>>>   multiple containers on the same host is not allowed, for obvious
>>>>   reasons). So either each remote runs on a distinct docker host
>>>>   machine, or a deterministic port assignment is implemented such
>>>>   that, for example, 6800 is always assigned to osd.0, regardless of
>>>>   where it runs.
>>>
>>> Would docker run --publish-all=true help ?
>>
>> That option doesn't work with --net=container, which is what we are
>> using in this case since we remap the container's sshd port 22. In
>> other words, for --publish-all to work we would need to use --net=host,
>> but that disables the virtual network that docker provides. An
>> alternative would be to configure the base image we're using
>> (https://github.com/tutumcloud/tutum-ubuntu/) so that the port that
>> sshd uses is passed in an env var.
>
> Why not use --net=host then ?
>
>>>
>>> Clever hack, congrats :-)
>>
>> thanks!
>
> --
> Loïc Dachary, Artisan Logiciel Libre
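P.S. to make the dockerize-config discussion (item 4 above) concrete,
here is a rough, untested sketch of the transformation. All it does is
swap the `install` task for the `install.ship_utilities` subtask (the
container image is already provisioned) and fail loudly on anything it
doesn't know how to handle, per your suggestion. The refusal to handle
the `kernel` task is an assumption on my part (a kernel can't be
installed inside a container), and the real script will certainly need
to cover more cases than this:

```python
#!/usr/bin/env python
# rough sketch of dockerize-config (untested). Reads a teuthology job
# config.yaml on stdin and writes a docker-friendly version to stdout:
#
#   dockerize-config < config.yaml > docker-config.yaml
#
# The only transformation applied is replacing the 'install' task with
# the 'install.ship_utilities' subtask; everything else is passed
# through, except for tasks we assume can't work inside a container,
# for which we fail loudly.
import sys

import yaml


def dockerize(config):
    new_tasks = []
    for task in config.get('tasks', []):
        # each entry is expected to be a single-key dict, e.g. {'install': None}
        if not isinstance(task, dict) or len(task) != 1:
            raise SystemExit('unexpected task entry: %r' % (task,))
        name = next(iter(task))
        if name == 'install':
            # drop the install options; the image is already provisioned
            new_tasks.append({'install.ship_utilities': None})
        elif name == 'kernel':
            # assumption: a kernel can't be installed inside a container
            raise SystemExit("don't know how to dockerize the 'kernel' task")
        else:
            new_tasks.append(task)
    config['tasks'] = new_tasks
    return config


if __name__ == '__main__':
    out = yaml.safe_dump(dockerize(yaml.safe_load(sys.stdin)),
                         default_flow_style=False)
    sys.stdout.write(out)
```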