On Wed, Apr 4, 2018 at 12:54 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> We identified several under-tested components in the Ceph project.
> Several of these consisted of tests that simply weren’t written:
> NFS-Ganesha has light testing in RGW, but none with CephFS; Samba’s
> testing is very light.
>
> Significantly more interesting is that none of the
> installers/orchestrators/normal process management (Ansible or DeepSea
> with systemd; containers under Kubernetes) are currently tested in
> teuthology. Changing that is a big desire for most of the integrators,
> but is a large project covering both the internal implementation and
> testing tasks. Right now, teuthology directly invokes Ceph processes
> via ssh and relies on that for control, for checking state (ie, the
> process is still running), and for easy logging of issues, and that
> has spilled over into important “task” modules such as the thrasher
> and cluster managers. There were rumors of individual efforts that
> might have been started to enable testing of a normal deployment, but
> nobody in the room knew for sure.
> PROBLEM TOPIC: support testing orchestration frameworks and the normal
> init system in teuthology

Correcting the ceph-ansible testing part: we have been running
ceph-ansible/ceph-deploy testing for quite some time, and it does
systemd testing internally. There is also a systemd task in smoke that
explicitly tests the processes for correctness.
a) http://pulpito.ceph.com/?suite=ceph-ansible
b) In smoke:
   https://github.com/ceph/ceph/blob/master/qa/tasks/systemd.py
   http://pulpito.ceph.com/teuthology-2018-04-04_07:02:02-smoke-master-testing-basic-ovh/2352423
   http://pulpito.ceph.com/teuthology-2018-04-04_07:02:02-smoke-master-testing-basic-ovh/2352436

But more work definitely needs to be done to integrate better with the
thrashers, and I am hopeful we will fix this issue soon, at least for
some suites: http://tracker.ceph.com/issues/23488

On the container side we don't have any tests, and I believe we should
start by fixing the install guide and recommendations so that we can
then fix this in the suites.

>
> Orit also discussed RGW in this context. She noted that RGW has good
> coverage of the basic S3 functionality but that more advanced features
> tend to miss some tests because they aren’t a good fit for the way we
> currently use teuthology. Her specific example was bucket sharding:
> trivial tests exist to make sure the commands operate and don’t
> immediately break, but actually stressing the sharding code requires
> millions of entries with ongoing IO, and dumping that much data into a
> cluster simply takes too long to reasonably be part of every suite run
> right now. So most testing is infrequent, manual, and ad-hoc.
> After discussion we suggested developers should build those tests even
> if they can’t be run regularly right now, because they can at least be
> run by teams prior to releases and it’s still cheaper and more
> reliable to find machine time than make a person do them all. I
> committed to discussing with the Ceph Leadership team whether it would
> be appropriate to start setting aside a small portion of time in the
> sepia lab to regularly do larger-scale tests like this, once they
> exist. (We suggested one or two days a month.)
> PROBLEM TOPIC: build scale tests in separate suites and reserve lab
> time to run them.
>
> The topic of distribution testing came up briefly.
> In modern history
> the lab has run Ubuntu (one or two LTSes) and CentOS (the latest
> release), with a mostly random mix unless your job demanded a specific
> OS. I believe Flipkart mentioned Debian as a possible target; we
> certainly build packages for Debian and a few other distros that
> aren’t tested in the lab at all. But the main issue with adding
> distros is that in addition to needing to keep the images up-to-date,
> they require (minor) changes to teuthology that we can’t keep alive
> without somebody committing to them. In cases where that happens,
> we’re happy to bring in new systems: both RHEL and Suse have been
> added to the sepia lab and teuthology in the last few months.
> A general note: we recently changed to doing a full OS provision on
> every test (via FOG), so if you want a random mix of OSes you now need
> to specify that. There’s a new teuthology “+” file operator for saying
> “select any one of these yaml frags for each test”.
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
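
P.S. For anyone unfamiliar with the "select any one of these yaml frags
for each test" behaviour mentioned in the quoted text, here is a rough
Python sketch of the idea only. It is not teuthology's implementation:
the function names and the fragment file names are made up for
illustration, and the seed parameter is just an assumption to make a
run reproducible.

```python
import random

# Hypothetical fragment names, one per distro the lab might provision.
DISTRO_FRAGMENTS = ["centos_latest.yaml", "ubuntu_16.04.yaml", "ubuntu_18.04.yaml"]


def pick_fragment(fragments, seed=None):
    """Return one fragment chosen at random from a non-empty collection."""
    if not fragments:
        raise ValueError("no fragments to choose from")
    rng = random.Random(seed)
    # Sort first so the same seed gives the same pick regardless of input order.
    return rng.choice(sorted(fragments))


def assign_distros(job_names, fragments, seed=None):
    """Map each job name to one randomly selected fragment."""
    if not fragments:
        raise ValueError("no fragments to choose from")
    rng = random.Random(seed)
    ordered = sorted(fragments)
    return {job: rng.choice(ordered) for job in job_names}


if __name__ == "__main__":
    # Each job gets exactly one distro fragment instead of the full cross product.
    for job, frag in assign_distros(["job-a", "job-b", "job-c"], DISTRO_FRAGMENTS).items():
        print(job, "->", frag)
```

The point is that each test gets exactly one of the candidate
fragments, so the number of jobs does not multiply with the number of
distros.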