On Wed, Apr 24, 2019 at 11:11 AM Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>
> On Wed, Apr 24, 2019 at 10:50 AM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> >
> > Hello Travis, all,
> > I’ve been looking at the interfaces our ceph-qa-suite tasks expect
> > from the underlying teuthology and Ceph deployment tasks to try and
> > 1) narrow them down into something we can implement against other
> > backends (ceph-ansible, Rook, DeepSea, etc)
> > 2) see how those interfaces need to be adapted to suit the differences
> > between physical hosts and kubernetes pods.
>
> I would like to see that not coupled at all. Why does teuthology need
> to know about these? It would be really interesting to see a framework
> that can test against a cluster - regardless of how that cluster got
> there (or if its based on containers or baremetal)

Taking over an existing cluster without any knowledge of its setup
isn't practical because we need to manipulate it pretty intrusively in
order to perform our testing. I believe some of our tests involve
turning off an OSD, deliberately breaking its on-disk store, and
turning it back on again. We certainly do plenty of SIGABRT and other
things that require knowledge of the startup process/init system.

That said, I am indeed trying to get to the point where the
ceph-qa-suite tasks do not have any idea how the cluster came to exist
and are just working through a defined interface. The existing
"install" and "ceph" tasks are the pieces that set up that interface,
and the ssh-based run() function is the most widespread "bad" part of
the existing interface, so those are the parts I'm focused on fixing
or working around right now. Plus we've learned from experience we
want to include testing those installers and init systems within our
tests...

> > So I’d like to know how this all sounds. In particular, how
> > implausible is it that we can ssh into Ceph containers and execute
> > arbitrary shell commands?
> >
> That is just not going to work in the way teuthology operates. Poking
> at things inside a container depends on the deployment type, for
> example, docker would do something like
> `docker exec` while kubernetes (and openshift) does it a bit differently.
>
> You can't just ssh.

Yes, and for things like invoking Ceph or samba daemons we have good
interfaces to abstract that out. But for things like "run this python
script I've defined in-line to scrape up a piece of data I care about"
there aren't any practical replacements. We can move away from doing
that, but I'd like to explore what our options are before I commit to
either 1) re-writing all of that code or 2) turning off every one of
those tests, as a precondition of testing in Rook.

Anyway I really haven't done much with Kubernetes and I didn't realize
you could just get a shell out of it (I thought it fought pretty hard
to *prevent* that...) so I'll spend some more time looking at it.
-Greg
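
P.S. To make the "defined interface" idea concrete, here is a minimal
sketch of how a run()-style call could be routed through ssh,
`docker exec`, or `kubectl exec` depending on the backend. This is not
teuthology's actual API: the class names, the wrap() method, and the
default "rook-ceph" namespace are all assumptions made up for
illustration.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class SshTarget:
    """Hypothetical: a daemon living on a physical host or VM."""
    host: str

    def wrap(self, args: List[str]) -> List[str]:
        # ssh hands its trailing arguments to a remote shell as one
        # string, so each argument is quoted before joining.
        return ["ssh", self.host, " ".join(shlex.quote(a) for a in args)]


@dataclass
class DockerTarget:
    """Hypothetical: a daemon living in a local docker container."""
    container: str

    def wrap(self, args: List[str]) -> List[str]:
        return ["docker", "exec", self.container, *args]


@dataclass
class KubeTarget:
    """Hypothetical: a daemon living in a kubernetes pod."""
    pod: str
    namespace: str = "rook-ceph"  # assumed default, not from the thread

    def wrap(self, args: List[str]) -> List[str]:
        # Everything after "--" is the command run inside the pod.
        return ["kubectl", "exec", "-n", self.namespace, self.pod, "--", *args]


def run(target, args: List[str]) -> str:
    """Run args wherever the target lives and return its stdout."""
    result = subprocess.run(
        target.wrap(args), check=True, capture_output=True, text=True
    )
    return result.stdout
```

Under these assumptions an in-line script would go through the same
path, e.g. run(target, ["python3", "-c", script]), and the calling task
would not need to know which kind of target it was handed.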