Re: Teuthology & Rook (& DeepSea, ceph-ansible, ...)

Gregory Farnum <gfarnum@xxxxxxxxxx> · Wed, 24 Apr 2019 13:34:49 -0700

On Wed, Apr 24, 2019 at 11:11 AM Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>
> On Wed, Apr 24, 2019 at 10:50 AM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> >
> > Hello Travis, all,
> > I’ve been looking at the interfaces our ceph-qa-suite tasks expect
> > from the underlying teuthology and Ceph deployment tasks to try and
> > 1) narrow them down into something we can implement against other
> > backends (ceph-ansible, Rook, DeepSea, etc)
> > 2) see how those interfaces need to be adapted to suit the differences
> > between physical hosts and kubernetes pods.
>
> I would like to see that not coupled at all. Why does teuthology need
> to know about these? It would be really interesting to see a framework
> that can test against a cluster, regardless of how that cluster got
> there (or whether it's based on containers or bare metal).

Taking over an existing cluster without any knowledge of its setup
isn't practical because we need to manipulate it pretty intrusively in
order to perform our testing. I believe some of our tests involve
turning off an OSD, deliberately breaking its on-disk store, and
turning it back on again. We certainly send plenty of SIGABRTs and do
other things that require knowledge of the startup process/init system.
That said, I am indeed trying to get to the point where the
ceph-qa-suite tasks do not have any idea how the cluster came to exist
and are just working through a defined interface. The existing
"install" and "ceph" tasks are the pieces that set up that interface,
and the ssh-based run() function is the most widespread "bad" part of
the existing interface, so those are the parts I'm focused on fixing
or working around right now. Plus we've learned from experience we
want to include testing those installers and init systems within our
tests...
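
As a rough sketch of what "working through a defined interface" could
look like for the run() case, the Python below builds the local argv
needed to run one command under three transports. The names Target and
build_argv are hypothetical and are not teuthology's API; only the
general command-line shapes of ssh, `docker exec`, and `kubectl exec`
are taken as given.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Target:
    """Hypothetical description of where a daemon lives."""
    kind: str                        # "ssh", "docker", or "kubectl"
    name: str                        # hostname, container name, or pod name
    namespace: Optional[str] = None  # only used for kubectl


def build_argv(target: Target, command: List[str]) -> List[str]:
    """Return the local argv that would run `command` on `target`."""
    if target.kind == "ssh":
        # ssh hands a single string to the remote shell, so quote each
        # argument ourselves before joining.
        remote = " ".join(shlex.quote(c) for c in command)
        return ["ssh", target.name, remote]
    if target.kind == "docker":
        return ["docker", "exec", target.name] + command
    if target.kind == "kubectl":
        argv = ["kubectl", "exec"]
        if target.namespace:
            argv += ["-n", target.namespace]
        # "--" separates kubectl's own flags from the command to run.
        return argv + [target.name, "--"] + command
    raise ValueError("unknown target kind: %s" % target.kind)
```

A task written against build_argv() (or something like it) would not
need to know which backend it is talking to; the deployment task would
be the only piece that decides the Target kind.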

> > So I’d like to know how this all sounds. In particular, how
> > implausible is it that we can ssh into Ceph containers and execute
> > arbitrary shell commands?
>
> That is just not going to work in the way teuthology operates. Poking
> at things inside a container depends on the deployment type, for
> example, docker would do something like
> `docker exec` while kubernetes (and openshift) does it a bit differently.
>
> You can't just ssh.

Yes, and for things like invoking Ceph or samba daemons we have good
interfaces to abstract that out. But for things like "run this python
script I've defined in-line to scrape up a piece of data I care about"
there aren't any practical replacements. We can move away from doing
that, but I'd like to explore what our options are before I commit to
either 1) rewriting all of that code or 2) turning off every one of
those tests, as a precondition of testing in Rook.
Anyway, I really haven't done much with Kubernetes, and I didn't realize
you could just get a shell out of it (I thought it fought pretty hard
to *prevent* that...), so I'll spend some more time looking at it.
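
For reference, a minimal sketch of how the "python script defined
in-line" case might map onto `kubectl exec`: the script is piped over
stdin to an interpreter inside the pod. The function names are
hypothetical, and this assumes the container image ships python3 and
that the caller is allowed to exec into the pod.

```python
import subprocess
from typing import List


def inline_script_argv(pod: str, namespace: str) -> List[str]:
    # -i keeps stdin open so the script can be piped in;
    # "python3 -" makes the interpreter read its program from stdin.
    return ["kubectl", "exec", "-i", "-n", namespace, pod,
            "--", "python3", "-"]


def run_inline_script(pod: str, namespace: str, script: str) -> str:
    """Run `script` inside the pod and return whatever it printed."""
    result = subprocess.run(
        inline_script_argv(pod, namespace),
        input=script,          # sent to the remote interpreter's stdin
        capture_output=True,
        text=True,
        check=True,            # raise if kubectl or the script fails
    )
    return result.stdout
```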
-Greg