Following up with some general comments on the main downsides of containers and on the upsides that led us down this path in the first place. Aside from a few minor misunderstandings, it seems like most of the objections to containers boil down to a few major points:

> Containers are more complicated than packages, making debugging harder.

I think part of this comes down to a learning curve and some semi-arbitrary changes to get used to (e.g., the systemd unit name has changed; logs are now in /var/log/ceph/$fsid instead of /var/log/ceph). Another part of these changes involves real hoops to jump through: to inspect process(es) inside a container you have to `cephadm enter --name ...`; the ceph CLI may not be automatically installed on every host; stracing or finding coredumps requires extra steps. We're continuing to improve the tools, so please call these things out as you see them!

> Security (50 containers -> 50 versions of openssl to patch)

This feels like the most tangible critique. It's a tradeoff. We have had so many bugs over the years due to varying versions of our dependencies that containers feel like a huge win: we can finally test and distribute something that we know won't break due to some random library on some random distro. But it means the Ceph team is on the hook for rebuilding our containers when the libraries inside them need to be patched.

On the flip side, cephadm's use of containers offers some huge wins:

- Package installation hell is gone. Previously, ceph-deploy and ceph-ansible had thousands of lines of code to deal with the myriad ways that packages could be installed and where they could be published. With containers, this now boils down to a single string, which is usually just something like "ceph/ceph:v16". We've grown a handful of complexity there to let you log into private registries, but otherwise things are so much simpler. Not to mention what happens when package dependencies break.

- Upgrades/downgrades can be carefully orchestrated. With packages, the version change is per host, with a limbo period (and the occasional SIGBUS) before daemons are restarted. Now we can run new or patched code on individual daemons and avoid an accidental upgrade when a daemon restarts. (Also, running e.g. ceph CLI commands no longer errors out with a dynamic linker error while the package upgrade itself is in progress, something all of our automated upgrade tests have to carefully avoid to prevent intermittent failures.)

- Ceph installations are carefully sandboxed. Removing/scrubbing ceph from a host is trivial, as only a handful of directories or configuration files are touched. And we can safely run multiple clusters on the same machine without worrying about bad interactions (mostly great for development, but also handy for users experimenting with new features, etc.).

- Cephadm deploys a bunch of non-ceph software as well to provide a complete storage system, including haproxy and keepalived for HA ingress for RGW and NFS, ganesha for NFS service, grafana, prometheus, node-exporter, and (soon) samba for SMB. All of it is neatly containerized to avoid bumping into other software on the host; testing and supporting the huge matrix of package versions available via various distros would be a huge time sink.

Most importantly, cephadm and the orchestrator API vastly improve the overall ceph experience from the CLI and dashboard. Users no longer have to give any thought to where and which daemons run if they don't want to (or they can carefully specify daemon placement if they choose).
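To make the day-to-day flow a bit more concrete, here's a rough sketch of the kinds of commands involved in the points above (inspecting a containerized daemon, finding logs, explicit placement, orchestrated upgrades). The hostnames, $fsid, daemon names, and image tag are placeholders, and exact flags can vary between releases, so treat this as a sketch rather than a recipe:

    # list the daemons cephadm manages on this host
    cephadm ls

    # shell into a daemon's container to inspect its processes
    cephadm enter --name mon.host1

    # daemon logs via journald (note the fsid in the unit name) ...
    journalctl -u ceph-$fsid@mon.host1.service
    # ... or, if file logging is enabled, under the per-cluster directory
    ls /var/log/ceph/$fsid/

    # cluster-wide view of daemons, and explicit placement if you want it
    ceph orch ps
    ceph orch apply mon --placement="host1 host2 host3"

    # orchestrated upgrade to a specific container image, one daemon at a time
    ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.10
    ceph orch upgrade status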
Users can also run commands like 'ceph fs volume create foo' and the fs will get created *and* MDS daemons will be started, all in one go. (This would also be possible with a package-based orchestrator implementation if one existed.)

We've been beaten up for years about how complicated and hard Ceph is. Rook and cephadm represent two of the most successful efforts to address usability (and not just because they enable deployment management via the dashboard!), and taking advantage of containers was one expedient way to get to where we needed to go. If users feel strongly about supporting packages, we can get much of the same experience with another package-based orchestrator module. My view, though, is that we have much higher-priority problems to tackle.

sage
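PS: for anyone curious what the "all in one go" part looks like in practice, a rough sketch ('foo' is just an example volume name, and the exact output will differ per cluster):

    # create the filesystem; the orchestrator schedules MDS daemons for it
    ceph fs volume create foo

    # confirm the fs is up and see which MDS daemons were deployed where
    ceph fs status foo
    ceph orch ps | grep mds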